Abstract. Organizations that collect and use large volumes of personal information
are expected under the principle of accountable data governance to take
measures to protect data subjects from risks that arise from inappropriate uses of
this information. In this paper, we focus on a specific class of mechanisms
(audits to identify policy violators, coupled with punishments) that organizations
such as hospitals, financial institutions, and Web services companies may adopt
to protect data subjects from privacy and security risks stemming from inappropriate
information use by insiders. We model the interaction between the organization
(defender) and an insider (adversary) during the audit process as a repeated
game. We then present an audit strategy for the defender. The strategy requires
the defender to commit to its action and, when paired with the adversary’s best
response to it, provably yields an asymmetric subgame perfect equilibrium. We
then present two mechanisms for allocating the total audit budget for inspections
across all games the organization plays with different insiders. The first
mechanism allocates budget to maximize the utility of the organization. Observing
that this mechanism protects the organization’s interests but may not protect
data subjects, we introduce an accountable data governance property, which requires
the organization to conduct thorough audits and impose punishments on
violators. The second mechanism we present achieves this property. We provide
evidence that a number of parameters in the game model can be estimated from
prior empirical studies and suggest specific studies that can help estimate other
parameters. Finally, we use our model to predict observed practices in industry
(e.g., differences in punishment rates of doctors and nurses for the same violation)
and the effectiveness of policy interventions (e.g., data breach notification
laws and government audits) in encouraging organizations to adopt accountable
data governance practices.

1  Introduction


Organizations  that  collect  and  use  large  volumes  of  personal  information  are  expected 
under  the  principle  of  accountable  data  governance  to  take  measures  to  protect  data 
subjects  from  risks  that  arise  from  these  uses  of  information  [1,2].  In  this  paper,  we 
focus  on  a  specific  class  of  mechanisms — audits  to  identify  policy  violators  coupled 
with  punishments — that  organizations  such  as  hospitals,  financial  institutions,  and  Web 
services  companies  may  adopt  to  protect  data  subjects  from  privacy  and  security  risks 
stemming from inappropriate information use by authorized insiders. Indeed, commercial
audit tools are emerging to assist in the process of detecting inappropriate information
use by insiders [3], and privacy policy violations and associated sanctions
are routinely reported in the healthcare sector [4-7].

A  central  challenge  in  this  setting  is  the  design  of  effective  audit  and  punishment 
schemes.  We  assume  that  in  each  audit  round  audit  logs  are  first  analyzed  using  an 
automated  tool  that  ranks  actions  by  insiders  as  potential  violations.  Our  focus  is  on  the 
next  step  when  a  subset  of  these  actions  is  inspected  (because  of  budgetary  constraints) 
to  identify  and  punish  policy  violators.  We  seek  to  compute  the  inspection  level  and 
punishment  level  for  an  “effective”  scheme. 

The challenge in modeling the complex interaction between the auditor and audited
agent includes making reasonable abstractions and assumptions. We model the interaction
between an organization (the defender) and the insider (the adversary) as a repeated
game with imperfect information (the defender does not observe the adversary’s
actions) and public signals (the outcome of the audit is public). The model captures a
number of important economic considerations that influence the design of audit mechanisms.
The game model (described in Section 3) replaces the byzantine adversary model
in our previous work [8] with a near-rational adversary model. These adversaries act
rationally with high probability and in a byzantine manner otherwise (similar to a trembling
hand assumption [9]). Adversaries benefit from violations they commit (e.g., by
selling personal data) and suffer due to punishments imposed for detected violations.
The model generalizes from the situation in which the defender interacts with a single
adversary to one where she interacts with multiple, non-colluding adversaries via a natural
product game construction that we define. Each audit game is parametrized by a
budget that the defender can use to conduct inspections.

We then present an audit strategy for the defender. This strategy, when paired with
the adversary’s best response to it, provably yields an asymmetric approximate subgame
perfect equilibrium (Theorem 1). This equilibrium concept implies that the adversary
does not gain at all from deviating from her best response strategy (see Section 4). We
define this equilibrium concept by adapting the standard notion of approximate subgame
perfect equilibrium, which has a symmetric flavor and permits both players to
obtain small gains by unilaterally deviating from their equilibrium strategy. The symmetric
equilibrium concept is unsuitable for our security application, where an adversary
who deviates motivated by a small gain could cause a big loss for the organization.
The defender’s strategy involves committing to a level of inspection and punishment.
The strategy has two desirable properties. First, the commitment results in a predictable
equilibrium since the adversary plays her best response to the strategy. Second, the
strategy is deterrence dominant over the set of maximum utility defender strategies that
result in a perfect public equilibrium, i.e., whenever such a strategy deters the adversary,
so does our audit strategy (see Theorem 2 for the formal statement).

We design two mechanisms with which the defender can allocate her total audit
budget across the different games to audit different insiders and types of potential violations.
The first mechanism optimizes the defender’s utility. Observing that this mechanism
protects the organization’s interests but may not protect data subjects, we introduce
an accountable data governance property, which places an operational requirement on
the organization to use a sufficiently effective log analysis tool and maintain sufficiently
high inspection and punishment rates. The second mechanism allocates the total audit
budget to achieve this property (see Section 5).

Finally, we demonstrate the usefulness of our model by predicting and explaining
observed practices in industry (e.g., differences in punishment rates of doctors and
nurses  for  the  same  violation)  and  analyzing  the  effectiveness  of  policy  interventions 
(e.g.,  data  breach  notification  laws  and  government  audits)  in  encouraging  organizations 
to  adopt  accountable  data  governance  practices  (see  Section  6).  We  present  comparisons 
to  additional  related  work  in  Section  7  and  conclusions  and  directions  for  future  work 
in  Section  8. 


2  Overview 

In  this  section,  we  provide  an  overview  of  our  model  using  a  motivating  scenario  that 
will  serve  as  a  running  example  for  this  paper.  Consider  a  “Hospital  X”  with  employees 
in  different  roles  (doctors,  nurses).  X  conducts  weekly  audits  to  ensure  that  accesses  to 
personal  health  records  are  legitimate.  Given  budget  constraints,  X  cannot  check  every 
single  access.  The  first  step  in  the  audit  process  is  to  analyze  the  access  logs  using 
an  automated  tool  that  ranks  accesses  as  potential  violations.  Hospital  X  assesses  the 
(monetary)  impact  of  different  types  of  violations  and  decides  what  subset  to  focus  on 
by  balancing  the  cost  of  audit  and  the  expected  impact  (“risk”)  from  policy  violations. 
This  type  of  audit  mechanism  is  common  in  practice  [10-13]. 

We  provide  a  game  model  for  this  audit  process.  An  employee  (“adversary,”  A) 
executes  tasks,  i.e.,  actions  that  are  permitted  as  part  of  their  job.  We  only  consider 
tasks  that  can  later  be  audited,  e.g.,  through  inspection  of  logs.  For  example,  in  X  the 
tasks are accesses to health records. We distinguish between A’s legitimate
tasks and violations of a policy. Different types of violations may have different impact
on  the  organization.  We  assume  that  there  are  K  different  types  of  violations  that  A  can 
commit.  Examples  of  violations  of  different  types  in  Hospital  X  include  inappropriate 
access  to  a  celebrity’s  health  record,  or  access  to  a  health  record  leading  to  identity 
theft.  A  benefits  by  committing  violations:  the  benefit  is  quantifiable  using  information 
from  existing  studies  or  by  human  judgment.  For  example,  reports  [14, 15]  indicate  that 
on  average  the  personal  benefit  of  a  hospital  employee  from  selling  a  common  person’s 
health  record  is  $50.  On  the  other  hand,  if  A  is  caught  committing  a  violation  then  she 
is punished according to the punishment policy used by the organization (the defender, D). For example, employees
could  be  terminated,  as  happened  in  similar  recent  incidents  [6, 7]. 

The organization D can classify each adversary’s task by type. However, D cannot
determine with certainty whether a particular task is legitimate or a violation without
investigating. Furthermore, D cannot inspect all of A’s tasks due to budgetary constraints.
As such, some violations may go undetected internally, but could be detected
externally. Governmental audits, whistle-blowing, and patient complaints [16, 17] are all
examples of situations that could lead to external detection of violations. Externally
detected violations usually cause more economic damage to the organization than internally
caught violations. The 2011 Ponemon Institute report [18] states that patients
whose privacy has been violated are more likely to leave (and possibly sue) a hospital
if they discover the violation on their own than if the hospital detects the violation and
proactively notifies the patient.

The economic impact of a violation is a combination of direct and indirect costs; direct
costs include breach notification and remedial cost, and indirect costs include loss
of customers and brand value. For example, the 2010 Ponemon Institute report [19]
states that the average cost of privacy breach per record in health care is $301, with
indirect costs about two thirds of that amount. Of course, certain violations may result
in much higher direct costs, e.g., $25,000 per record (up to $250,000 in total) in
fines alone in the state of California [6]. These fines may incentivize organizations to
adopt aggressive punishment policies. However, severe punishment policies create a
hostile work environment resulting in economic losses for the organization due to low
employee motivation and a failure to attract new talent [20].

The  organization  needs  to  balance  auditing  costs,  potential  economic  damages  due 
to  violations  and  the  economic  impact  of  the  punishment  policy.  The  employees  need  to 
weigh  their  gain  from  violating  policies  against  loss  from  getting  caught  by  an  audit  and 
punished.  The  actions  of  one  party  impact  the  actions  of  the  other  party:  if  employees 
never  violate,  the  organization  does  not  need  to  audit;  likewise,  if  the  organization  never 
audits, employees can violate policies with total impunity. Given this strategic interdependency,
we model the auditing process as a repeated game between the organization and
its employees, where the discrete rounds characterize audit cycles. The game is parameterized
by quantifiable variables such as the personal benefit of the employee, the cost of
a breach, and the cost of auditing, among others. The organization is engaged in multiple
such  games  simultaneously  with  different  employees  and  has  to  effectively  allocate  its 
total  audit  budget  across  the  different  games. 


3  Audit  Game  Model 

We begin by providing a high level view of the audit process, before describing the
audit game in detail (Section 3). In practice, the organization is not playing a repeated
audit game against a specific employee, but against all of its $n$ employees at the same
time. However, if we assume that 1) a given employee’s actions for a type of task are
independent of her actions for other types, and that 2) employees do not collude with
other employees and act independently, we can decompose the overall game into $nK$
independent base repeated games that the organization plays in parallel. One base repeated
game corresponds to a given type of access $k$ by a given employee $A$, and will
be denoted by $G_{A,k}$. Each game $G_{A,k}$ is described using many parameters, e.g., loss
due to violations, personal benefit for the employee, etc. We abuse notation in using $G_{A,k}$
to refer to a base repeated game of type $k$ with any value of the parameters.


In our proposed audit process the organization follows the steps below in each audit
cycle for every game $G_{A,k}$. Assume that, by the time auditing is first performed, the
parameters of the game have been estimated and the equilibrium audit strategy computed.

before audit:

1.  If any parameter changes go to step 2 else go to audit.

2.  Estimate parameters. Compute equilibrium of $G_{A,k}$.

audit:

3.  Audit using actions of the computed equilibrium.
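
The control flow of these steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `estimate` and `solve` callables are hypothetical placeholders for parameter estimation and equilibrium computation (neither is specified at this point in the paper), and change detection is simplified to re-estimating every cycle and comparing with the previous estimate.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

Params = Tuple[float, ...]  # hypothetical container for estimated game parameters


@dataclass
class AuditLoop:
    """Re-computes the audit strategy only when the estimated parameters change."""
    estimate: Callable[[], Params]               # assumed supplied: parameter estimation
    solve: Callable[[Params], Dict[str, float]]  # assumed supplied: equilibrium computation
    params: Optional[Params] = None
    strategy: Optional[Dict[str, float]] = None
    recomputations: int = 0

    def run_cycle(self) -> Dict[str, float]:
        # "before audit": steps 1 and 2
        new_params = self.estimate()
        if self.strategy is None or new_params != self.params:
            self.params = new_params
            self.strategy = self.solve(new_params)
            self.recomputations += 1
        # "audit": step 3 would use the returned strategy
        return self.strategy
```

The cached strategy is reused for as long as the estimates stay the same, which mirrors the "else go to audit" branch of step 1.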

Note that the parameters of $G_{A,k}$ may change for any given round of the game, resulting
in a different game. However, neither D nor A knows when that will happen. As such,
since the horizon of $G_{A,k}$ with a fixed set of parameters is infinite, we can describe
the interaction between the organization and its employees with an infinitely repeated
game for the period in which the parameters are unchanged (see [9] for details). Thus,
the game $G_{A,k}$ is an infinitely repeated game of imperfect information since A’s action
is not directly observed. Instead, noisy information about the action, called a public signal,
is observed. The public signal here consists of a) the detected violations, b) the number
of tasks by A, and c) D’s action. The $K$ parallel games played between A and D can
be composed in a natural manner into one repeated game (which we call $G_A$) by taking
the product of action spaces and adding up utilities from the games.

Finally, analyzing data to detect changes of parameters may require the use of statistical
methods [21], data mining and learning techniques. We do not delve into details
of these methods as that is beyond the scope of this paper and estimating risk parameters
has been studied extensively in many contexts [10-13, 15]. Observe that a change of
parameters may change the equilibrium of the game, e.g., a lot of violations in quick
succession by an employee (in spite of being inspected sufficiently) may result in the
organization revising its estimate of the personal benefit of the employee, leading to more inspection.

Formal Description. In the remainder of this section, we focus on the base repeated
games $G_{A,k}$. We use the following notations in this paper:

•  Vectors are represented with an arrow on top, e.g., $\vec{v}$ is a vector. The $i$th component
of a vector is given by $\vec{v}(i)$. $\vec{v} \le \vec{a}$ means that both vectors have the same number of
components and for any component $i$, $\vec{v}(i) \le \vec{a}(i)$.

•  Random variables are represented in boldface, e.g., $\mathbf{x}$ and $\mathbf{X}$ are random variables.

•  $E(\mathbf{X})[q, r]$ denotes the expected value of random variable $\mathbf{X}$, when particular parameters
of the probability mass function of $\mathbf{X}$ are set to $q$ and $r$.

•  We will use a shorthand form by dropping $A$, $k$ and the vector notation, as we assume
these are implicitly understood for the game $G_{A,k}$, i.e., a quantity $\vec{x}_A(k)$ will be simply
denoted as $x$. We use this form whenever the context is restricted to game $G_{A,k}$ only.

$G_{A,k}$ is fully defined by the players, the time granularity at which the game is played,
the actions the players can take, and the utility the players obtain as a result of the
actions they take. We next discuss these different concepts in turn.

Players: The game $G_{A,k}$ is played between the organization D and an adversary A. For
instance, the players are Hospital X and a nurse in X.

Round of play: In practice, audits for all employees and all types of access are performed
together and usually periodically. Thus, we adopt a discrete-time model, where
time points are associated with rounds. Each round of play corresponds to an audit cycle.
We group together all of A’s actions (tasks of a given type) in a given round. All
games $G_{A,k}$ are synchronized, i.e., all rounds $t$ in all games are played simultaneously.

Adversary action space: In each round, the adversary A chooses two quantities of
type $k$: the number of tasks she performs, and the number of such tasks that are violations.
If we denote by $U_k$ the maximum number of type $k$ tasks that any employee
can perform, then A’s entire action space for $G_{A,k}$ is given by $A_k \times V_k$ with
$A_k = \{u_k, \ldots, U_k\}$ ($u_k \le U_k$) and $V_k = \{1, \ldots, U_k\}$. Let $\vec{a}_A^t$ and $\vec{v}_A^t$ be vectors of length
$K$ such that the components of vector $\vec{a}_A^t$ are the number of tasks of each type that A
performs at time $t$, and the components of vector $\vec{v}_A^t$ are the number of violations of each
type. Since violations are a subset of all tasks, we always have $\vec{v}_A^t \le \vec{a}_A^t$. In a given
audit cycle, A’s action in the game $G_{A,k}$ is defined by $(\vec{a}_A^t(k), \vec{v}_A^t(k))$, that is $(a^t, v^t)$
in shorthand form, with $a^t \in A_k$ and $v^t \in V_k$.

Rather than assuming that A is perfectly rational, we model A as playing with a trembling hand [9].
Whenever A chooses to commit $v^t$ violations in a given round $t$, she does so with
probability $1 - \epsilon_{th}$, but, with (small) probability $\epsilon_{th}$, she commits some other number
of violations sampled from an unknown distribution over all possible violations. In
other words, we allow A to act completely arbitrarily when she makes a mistake. For
instance, a nurse in X may lose her laptop containing health records, leading to a breach.
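
As a rough illustration of the trembling-hand assumption, the following Python sketch draws the number of violations actually committed in one round. The model deliberately leaves the error distribution unknown; the uniform draw used here is purely an assumption made so that the example runs, and the function and parameter names are hypothetical.

```python
import random


def trembling_hand_violations(intended, max_violations, eps_th, rng):
    """Return the number of violations actually committed in one round.

    With probability 1 - eps_th the adversary commits `intended` violations.
    Otherwise the count comes from a stand-in for the unknown distribution.
    ASSUMPTION: a uniform draw over {0, ..., max_violations}; the model itself
    does not specify this distribution (and a uniform draw may, by chance,
    return the intended count again).
    """
    if rng.random() < eps_th:
        return rng.randint(0, max_violations)
    return intended


rng = random.Random(42)
actual = [trembling_hand_violations(0, 5, 0.05, rng) for _ in range(1000)]
```

With a small `eps_th`, most rounds follow the intended action and only an occasional round deviates, which is the behavior the equilibrium analysis has to tolerate.
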
Defender action space: D also chooses two quantities of type $k$ in each round: the
number of inspections to perform, and the punishment to levy for each type-$k$ violation
detected. Let $\vec{s}_A^t$ be the vector of length $K$ such that components of vector $\vec{s}_A^t$ are
the number of inspections of each type that D performs in round $t$. The number of
inspections that D can conduct is bounded by the number of tasks that A performs,
and thus, $\vec{s}_A^t \le \vec{a}_A^t$. D uses a log analysis tool M to sort accesses according to the
probability of them being a violation. Then, D chooses the top $\vec{s}_A^t(k) = s^t$ tasks from
the sorted output of M to inspect in game $G_{A,k}$. Inspection is assumed perfect, i.e.,
if a violation is inspected, it is detected. The number of inspections is bounded by
budgetary constraints. Denoting the function that outputs the cost of inspection for each
type of violation by $\vec{C}$, we have $\vec{C}(k)(\vec{s}_A^t(k)) \le \vec{b}_A^t(k)$, where $\vec{b}_A^t(k)$ defines a
per-employee, per-type budget constraint. The budget allocation problem is an optimization
problem depending on the audit strategy, which we discuss in Section 5.1.

D also chooses a punishment rate $\vec{P}_A^t(k) = P^t$ (fine per violation of type $k$) in each
round $t$ to punish A if violations of type $k$ are detected. $P^t$ is bounded by a maximum
punishment $P_f$ corresponding to the employee being fired, and the game terminated.

Finally, D’s choice of the inspection action can depend only on A’s total number of
tasks, since the number of violations is not observed. Thus, D can choose its strategy as
a function from number of tasks to inspections and punishment even before A performs
her action. In fact, we simulate D acting first, with its actions observable, by requiring
D to commit to a strategy and provide a proof of honoring the commitment. Specifically,
D computes its strategy, makes it public and provides a proof of following the strategy
after auditing is done. The proof can be provided by maintaining an audit trail of the
audit process itself.

Outcomes: We define the outcome of a single round of $G_{A,k}$ as the number of violations
detected in internal audit and the number of violations detected externally. We
assume that there is a fixed exogenous probability $p$ ($0 < p < 1$) of an internally undetected
violation getting caught externally. Due to the probabilistic nature of all quantities,
the outcome is a random variable. Let $\vec{\mathbf{O}}_A^t$ be the vector of length $K$ such that
$\vec{\mathbf{O}}_A^t(k) = \mathbf{O}^t$ represents the outcome for the $t$th round for the game $G_{A,k}$. Then $\mathbf{O}^t$ is a
tuple $(\mathbf{O}^t_{int}, \mathbf{O}^t_{ext})$ of violations caught internally and externally. As stated earlier, we
assume the use of a log analysis tool M to rank the accesses with more likely violations
being ranked higher. Then, the probability mass function for $\mathbf{O}^t_{int}$ is a distribution
parameterized by $(a^t, v^t)$, $s^t$ and M. The baseline performance of M is when the $s^t$ accesses
to be inspected are chosen at random, resulting in a hyper-geometric distribution
with mean $v^t \alpha^t$, where $\alpha^t = s^t/a^t$. We assume that the mean of the distribution is
$\mu(\alpha^t) v^t \alpha^t$, where $\mu(\alpha^t)$ is a function dependent on $\alpha^t$ that measures the performance
of M and $\forall \alpha^t \in [0, 1]$, $\mu \ge \mu(\alpha^t) \ge 1$ for some constant $\mu$ ($\mu$ is overloaded here).
Note that we must have $\mu(\alpha^t)\alpha^t \le 1$, and further, we assume that $\mu(\alpha^t)$ is monotonically
non-increasing in $\alpha^t$. The probability mass function for $\mathbf{O}^t_{ext}$ conditioned on $\mathbf{O}^t_{int}$
is a binomial distribution parameterized by $p$.
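
To make the outcome distribution concrete, here is a small Python sketch that samples one round's outcome under the baseline case only, i.e., inspected tasks chosen uniformly at random (the hypergeometric case above). It does not model a better-than-random log analysis tool, and the function and argument names are hypothetical.

```python
import random


def sample_outcome(a, v, s, p, rng):
    """Sample (O_int, O_ext) for one round under the random-inspection baseline.

    a: tasks performed, v: violations among them (v <= a),
    s: tasks inspected (s <= a), p: external detection probability.
    """
    tasks = [True] * v + [False] * (a - v)   # True marks a violation
    inspected = rng.sample(tasks, s)         # s tasks drawn without replacement
    o_int = sum(inspected)                   # perfect inspection: every inspected violation is caught
    undetected = v - o_int
    # Each internally undetected violation is caught externally with probability p.
    o_ext = sum(rng.random() < p for _ in range(undetected))
    return o_int, o_ext


rng = random.Random(0)
draws = [sample_outcome(a=100, v=10, s=20, p=0.3, rng=rng) for _ in range(20000)]
mean_internal = sum(o_int for o_int, _ in draws) / len(draws)  # should be near v * s / a
```

Averaging many draws gives an internal-detection mean close to the product of the number of violations and the inspection level, which is the hypergeometric mean stated above.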

Utility functions: In a public signaling game like $G_{A,k}$, the utilities of the players depend
only on the public signal and their own action, while the strategies they choose
depend on the history of public signals [22]. The utility of the repeated game is defined
as a (delta-discounted) sum of the expected utilities received in each round, where the
expectation is taken with respect to the distribution over histories. Let the discount factor
for D be $\delta_D$ and for any employee A be $\delta_A$. We assume that D is patient, i.e., future
rewards are almost as important as immediate rewards, and $\delta_D$ is close to 1. A is less
patient than D and hence $\delta_A < \delta_D$.

Defender utility function: D’s utility in a round of the game $G_{A,k}$ consists of the sum
of the cost of inspecting A’s actions, the monetary loss from a high punishment rate
for A, and direct and indirect costs of violations. As discussed before, inspection costs
are given by $C(s^t)$, where $C = \vec{C}(k)$ is a function denoting the cost of inspecting
type-$k$ tasks. Similarly, the monetary loss from losing employee productivity due to
fear of punishment is given by $e(P^t)$, where $e = \vec{e}_A(k)$ is a function for type-$k$ tasks.
The functions in $\vec{C}$ and $\vec{e}_A$ must satisfy the following constraints: 1) they should be
monotonically increasing in the argument and 2) $\vec{C}(k) \ge 0$, $\vec{e}_A(k) \ge 0$ for all $k$.

We characterize the effect of violations on the organization’s indirect cost similarly
to the reputation loss as in previous work [8]. Additionally, the generic function described
below is capable of capturing direct costs, as shown in the example following
the function specification. Specifically, we define a function $r_k$ ($r$ in shorthand form)
that, at time $t$, takes as input the number of type-$k$ violations caught internally, the number
of type-$k$ violations caught externally, and a time horizon $\tau$, and outputs the overall
loss at time $t + \tau$ due to these violations at time $t$. $r$ is stationary (i.e., independent of
$t$), and externally caught violations have a stronger impact on $r$ than internally detected
violations. Further, $r((0, 0), \tau) = 0$ for any $\tau$ (undetected violations have 0 cost), and
$r$ is monotonically decreasing in $\tau$ and becomes equal to zero for $\tau \ge m$ (violations
are forgotten after a finite number of rounds). As in previous work [8], we construct
the utility function at round $t$ by immediately accounting for future losses due to violations
occurring at time $t$. This allows us to use standard game-theory results while, at
the same time, providing a close approximation of the defender’s loss [8]. With these
notations, D’s utility at time $t$ in $G_{A,k}$ is

$$\mathrm{Rew}_D^t(\langle s^t, P^t \rangle, \mathbf{O}^t) = -\sum_{j=0}^{m-1} \delta_D^j \, r(\mathbf{O}^t, j) - C(s^t) - e(P^t) . \quad (1)$$

This per-round utility is always negative (or at most zero). As is typical of security
games (e.g., [23, 24] and related work), implementing security measures does not provide
direct benefits to the defender, but is necessary to pare possible losses. Hence, the
goal for the defender is to have this utility as close to zero as possible.

The above function can capture direct costs of violations as an additive term at time
$\tau = 0$. As a simple example [8], assuming the average direct costs for internally and
externally caught violations are given by $R^D_{int}$ and $R^D_{ext}$, and the function $r$ is linear in
the random variables $\mathbf{O}^t_{int}$ and $\mathbf{O}^t_{ext}$, $r$ can be given by


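$$r(\mathbf{O}^t, \tau) = \begin{cases}
(c + R^D_{int})\,\mathbf{O}^t_{int} + (\psi c + R^D_{ext})\,\mathbf{O}^t_{ext} & \text{for } \tau = 0 \\
\delta^{\tau} c\,(\mathbf{O}^t_{int} + \psi \cdot \mathbf{O}^t_{ext}) & \text{for } 1 \le \tau < m \\
0 & \text{for } \tau \ge m,
\end{cases}$$

where $\delta \in (0, 1)$ and $\psi > 1$. Then Eqn. (1) reduces to

$$\mathrm{Rew}_D^t(\langle s^t, P^t \rangle, \mathbf{O}^t) = -R_{int}\,\mathbf{O}^t_{int} - R_{ext}\,\mathbf{O}^t_{ext} - C(s^t) - e(P^t) , \quad (2)$$

with $R_{int} = R^I_{int} + R^D_{int}$, $R^I_{int} = c(1 - \delta^m \delta_D^m)/(1 - \delta\delta_D)$ and $R_{ext} = \psi R^I_{int} + R^D_{ext}$.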
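
The reduction from Eqn. (1) to Eqn. (2) is a geometric sum, and it can be checked numerically. The Python sketch below evaluates the discounted sum of the linear loss function term by term and compares it with the closed-form rates; all numeric values are hypothetical and chosen only for illustration, and the function names are not from the paper.

```python
def indirect_internal_rate(c, delta, delta_d, m):
    """Closed form of c * sum_{j=0}^{m-1} (delta * delta_d)**j."""
    x = delta * delta_d
    return c * (1 - x ** m) / (1 - x)


def loss_rates(c, psi, delta, delta_d, m, r_d_int, r_d_ext):
    """Return (R_int, R_ext): indirect rate plus the direct cost terms."""
    r_i_int = indirect_internal_rate(c, delta, delta_d, m)
    return r_i_int + r_d_int, psi * r_i_int + r_d_ext


def discounted_loss(o_int, o_ext, c, psi, delta, delta_d, m, r_d_int, r_d_ext):
    """Term-by-term value of sum_{j=0}^{m-1} delta_d**j * r(O, j) for the linear r."""
    total = (c + r_d_int) * o_int + (psi * c + r_d_ext) * o_ext   # tau = 0
    for tau in range(1, m):                                       # 1 <= tau < m
        total += (delta_d ** tau) * (delta ** tau) * c * (o_int + psi * o_ext)
    return total


# Hypothetical parameter values, for illustration only.
example = dict(c=100.0, psi=2.0, delta=0.5, delta_d=0.9, m=5, r_d_int=50.0, r_d_ext=80.0)
r_int, r_ext = loss_rates(**example)
direct_sum = discounted_loss(3, 1, **example)
closed_form = r_int * 3 + r_ext * 1
```

Here `direct_sum` and `closed_form` agree up to floating-point error, which is the content of the reduction.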


Adversary utility function: We define A’s utility as the sum of A’s personal benefit
gained by committing violations and the punishment that results due to detected violations.
Personal benefit is a monetary measure of the benefit that A gets out of violations.
It includes all kinds of benefits, e.g., curiosity, actual monetary benefit (by selling private
data), revenge, etc. Naturally, the true personal benefit of A is known only to A.
Our model of personal benefit of A is linear and is defined by a rate of personal benefit
for each type of violation given by the vector $\vec{I}_A$ of length $K$. The punishment is the
vector $\vec{P}_A^t$ of length $K$ chosen by D, as discussed above. Using shorthand notation, A’s
utility for the game $G_{A,k}$ is:

$$\mathrm{Rew}_A^t(\langle a^t, v^t \rangle, \langle s^t, P^t \rangle, \mathbf{O}^t) = I v^t - P^t (\mathbf{O}^t_{int} + \mathbf{O}^t_{ext}) .$$


Observe that the utility function of a player depends on the public signal (observed
violations, D’s action) and the action of the player, which conforms to the definition of
a repeated game with imperfect information and public signaling. In such games, the
expected utility is used in computing equilibria.

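Let $\alpha^t = s^t/a^t$ and $\nu(\alpha^t) = \mu(\alpha^t)\alpha^t$. Then, $E(\mathbf{O}^t_{int}) = \nu(\alpha^t) v^t$, and
$E(\mathbf{O}^t_{ext}) = p v^t (1 - \nu(\alpha^t))$. The expected utilities in each round then become:

$$E(\mathrm{Rew}_D^t) = -E\Big(\sum_{j=0}^{m-1} \delta_D^j \, r(\mathbf{O}^t, j)\Big)[v^t, a^t, \alpha^t] - C(\alpha^t a^t) - e(P^t) ,$$
$$E(\mathrm{Rew}_A^t) = I v^t - P^t v^t \big(\nu(\alpha^t) + p(1 - \nu(\alpha^t))\big) .$$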


The expected utility of A depends only on the level of inspection $\alpha^t$ and not on the actual
number of inspections. For the example loss function given by Eqn. (2), the utility
function of D becomes:

$$E(\mathrm{Rew}_D^t) = -v^t \big(R_{int}\,\nu(\alpha^t) + R_{ext}\,p\,(1 - \nu(\alpha^t))\big) - C(\alpha^t a^t) - e(P^t) .$$
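
The per-round expected utilities can be written as small Python functions. This is a sketch rather than the authors' implementation: the tool-performance, inspection-cost, and punishment-loss functions are passed in as callables, and every name and numeric value below is an assumption made for illustration.

```python
def nu(alpha, mu_of_alpha):
    """Expected fraction of violations caught internally: mu(alpha) * alpha."""
    return mu_of_alpha(alpha) * alpha


def expected_adversary_utility(v, alpha, punishment, benefit, p, mu_of_alpha):
    """Benefit from v violations minus expected punishment for detected ones."""
    n = nu(alpha, mu_of_alpha)
    return benefit * v - punishment * v * (n + p * (1 - n))


def expected_defender_utility(v, a, alpha, punishment, r_int, r_ext, p,
                              mu_of_alpha, cost, punishment_loss):
    """Expected violation losses plus inspection cost and punishment loss, negated."""
    n = nu(alpha, mu_of_alpha)
    return (-v * (r_int * n + r_ext * p * (1 - n))
            - cost(alpha * a) - punishment_loss(punishment))


# Hypothetical inputs: random-inspection baseline (mu = 1) and linear cost functions.
baseline_mu = lambda alpha: 1.0
u_adv = expected_adversary_utility(v=2, alpha=0.5, punishment=10.0, benefit=6.0,
                                   p=0.5, mu_of_alpha=baseline_mu)
u_def = expected_defender_utility(v=2, a=10, alpha=0.5, punishment=10.0,
                                  r_int=100.0, r_ext=300.0, p=0.5,
                                  mu_of_alpha=baseline_mu,
                                  cost=lambda x: 2.0 * x,
                                  punishment_loss=lambda pun: 0.1 * pun)
```

Note that the adversary's utility uses only the inspection level `alpha`, not the raw number of inspections, matching the observation above.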


In addition to the action dependent utilities above, the players also receive a fixed utility
every round, which is the salary for A and the value generated by A for D. $P_f$ depends on
these values, and is calculated in Appendix B.2. Finally, the model parameters that may
change over time are $R_{ext}$, $R_{int}$, $p$, function $C$, function $e$, function $\mu$ and $I$.


Fig. 1. Non-deterred (×) and deterred (+) region for I = $6. I = $11 has empty deterred region.


Graphical representation: A graphical representation of the utilities helps illustrate
the ideas presented in the next two sections (see Figure 1). Consider the 2-dimensional
plane $R_{\alpha,P}$ spanned by $\alpha^t$ and $P^t$. We define a feasible audit space in $R_{\alpha,P}$ given by
$0 \le \alpha^t \le 1$ and $0 \le P^t \le P_f$. D’s actions are points in the feasible region. The
expected utility of the adversary in each round is given by
$v^t(I - P^t(\nu(\alpha^t) + p(1 - \nu(\alpha^t))))$. Thus, the curve in $R_{\alpha,P}$ given by
$I = P^t(\nu(\alpha^t) + p(1 - \nu(\alpha^t)))$ is the
separator between positive and negative expected utility regions for the adversary in
each round. Within the feasible region, we call the region of positive expected utility
the non-deterred region and the region of negative utility the deterred region.

A's utility can also be non-linear, e.g., if D decides to scale punishment quadratically
with violations. Technically, this partitions the feasible audit space into many
regions, with each region associated with the number of violations that maximizes the
utility of A in that region. We emphasize that the equilibrium presented later can be
easily extended to consider such cases. To keep the presentation simple we use
the linear utility throughout the paper, which yields two regions associated with zero or all
violations. Similarly, it is possible to add any other relevant term to D's utility, e.g., if
D satisfies certain accountability criteria (defined later in Section 5) then it may earn
a positive benefit from increased reputation.

Estimation: Next, we describe techniques for estimating the parameters of game G_{A,k},
obtaining sample estimates in the process. Before getting to constant values, we state the
functions that we use as concrete instances for the examples in this paper. We use simple
linear functions for audit cost (C(αa) = Cαa) and for punishment loss (e(P) = eP).
The performance of the log analysis tool M depends on the tool being used, and we use a
linear function for μ(.) to get ν(α) = μα - (μ - 1)α^2, where μ is a constant. Further, we
use the example loss function (with R_int and R_ext) stated in the last sub-section. We note
that our theorems work with any function; the functions above are the simplest functions
that satisfy the constraints stated in the last sub-section. Next, we
gather data from industry-wide studies to obtain sample estimates for parameters.


As stated in Section 2, values of direct and indirect costs of violation (the average of
R_int and R_ext is $300 in healthcare [19]; a detailed breakdown is present in the ANSI
report [15]), maximum personal benefit I ($50 for medical records [14, 15]), etc. are
available in studies. Also, in the absence of studies quantitatively distinguishing externally
and internally caught violations, we assume R_int = R_ext = $300. Many parameters
depend on the employee, the employee's role in the organization and the type of violation.
Keeping track of violations and behavior within the organization offers a data source for
estimating and detecting changes in these parameters. We choose values for these parameters
that are not extremes: e = $10, I = $6, ε_th = 0.03, δ_A = 0.4 and U_k = 40. Further,
under certain assumptions we calculate P_f (in Appendix B.2) to get P_f = $10. Finally,
the average cost of auditing C and the performance factor μ of the log analysis tool should be
known to D. We assume values C = $50 and tool performance μ = 1.5.
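
The deterred and non-deterred regions of the feasible audit space can be sketched numerically. The following Python fragment is a minimal illustration, not part of the original analysis: it assumes the example function ν(α) = μα - (μ - 1)α^2 with μ = 1.5, an external detection probability p = 0.5 (an assumption; this section does not fix p), and a grid with steps 0.5 in P and 0.05 in α. The helper names are hypothetical.

```python
def nu(alpha, mu=1.5):
    """Detection probability nu(alpha) = mu*alpha - (mu - 1)*alpha**2."""
    return mu * alpha - (mu - 1.0) * alpha ** 2


def adversary_gain(I, P, alpha, p=0.5, mu=1.5):
    """Expected per-violation utility of A: I - P*(nu + p*(1 - nu))."""
    n = nu(alpha, mu)
    return I - P * (n + p * (1.0 - n))


def deterred_points(I, P_f=10.0, p=0.5, mu=1.5, p_step=0.5, a_step=0.05):
    """Grid points (P, alpha) of the feasible space where A's gain is negative."""
    points = []
    n_p = int(round(P_f / p_step))
    n_a = int(round(1.0 / a_step))
    for i in range(n_p + 1):
        for j in range(n_a + 1):
            P, alpha = i * p_step, j * a_step
            if adversary_gain(I, P, alpha, p, mu) < 0:
                points.append((P, alpha))
    return points


if __name__ == "__main__":
    print(len(deterred_points(I=6.0)))   # some grid points deter A
    print(len(deterred_points(I=11.0)))  # none: I exceeds P_f = 10
```

Under these assumptions the expected punishment per violation never exceeds P_f, which is why a personal benefit of $11 leaves the deterred region empty, as in Figure 1.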


4  Auditing  Strategy 

In  this  section,  we  define  a  suitable  equilibrium  concept  for  the  audit  game  (Section  4.1) 
and  present  a  strategy  for  the  defender  such  that  the  best  response  to  that  strategy  by  the 
adversary  results  in  an  equilibrium  being  attained  (Section  4.2).  Finally,  we  compare 
our  equilibrium  with  other  equilibria  (Section  4.3).  Recall  that  the  equilibrium  of  the 
game  occurs  in  the  period  in  which  the  game  parameters  are  fixed. 

4.1  Equilibrium  Concepts 

We begin by introducing standard terminology from game theory. In a one-shot extensive
form game players move in order. We assume player 1 moves first followed by
player 2. An extensive form repeated game is one in which the round game is a one-shot
extensive game. The history is a sequence of actions. Let H be the set of all possible
histories. Let S_i be the action space of player i. A strategy of player i is a function
σ_i : H_i → S_i, where H_i ⊂ H are the histories in which player i moves. The utility in
each round is given by r_i : S_1 × S_2 → R. The total utility is a δ_i-discounted sum of
utilities of each round, normalized by 1 - δ_i.
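
As a small illustration of the normalized discounted sum, the following Python sketch computes (1 - δ) Σ_t δ^t r_t for a finite list of round utilities. Truncating the sum to finitely many rounds is an assumption made for the example, and the function name is hypothetical.

```python
def normalized_discounted_utility(round_utilities, delta):
    """(1 - delta) * sum_t delta**t * r_t for a finite list of round utilities."""
    if not 0.0 <= delta < 1.0:
        raise ValueError("delta must lie in [0, 1)")
    total = 0.0
    weight = 1.0  # delta**t for the current round t
    for r in round_utilities:
        total += weight * r
        weight *= delta
    return (1.0 - delta) * total
```

The normalization makes the total comparable to a per-round utility: a constant per-round utility r over many rounds gives a total close to r.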

The definition of strategies extends to extensive form repeated games with public
signals. We consider a special case here that resembles our audit game. Player 1 moves
first and the action is observed by player 2; then player 2 moves, but that action may
not be perfectly observed, instead resulting in a public signal. Let the space of public
signals be Y. In any round, the observed public signal is distributed according to the
distribution ΔY(.|s), i.e., ΔY(y|s) is the probability of seeing signal y when the action
profile s is played. In these games, a history is defined as an alternating sequence of
player 1's actions and public signals, ending in a public signal for histories in which
player 1 has to move and ending in player 1's move for histories in which player 2 has
to move. The actual utility in each round is given by the function r_i : S_i × Y → R.
The total expected utility g_i is the expected normalized δ_i-discounted sum of utilities
of each round, where the expectation is taken over the distribution over public signals
and histories. For any history h, the game to be played in the future after h is called the
continuation game of h, with total utility given by g_i(σ, h).


A strategy profile (σ_1, σ_2) is a subgame perfect equilibrium (SPE) of a repeated
game if it is a Nash equilibrium for all continuation games given by any history h [9].
One way of determining if a strategy is a SPE is to determine whether the strategy
satisfies the single stage deviation property, that is, any unilateral deviation by any
player in any single round is not profitable. We define a natural extension of SPE, which
we call asymmetric subgame perfect equilibrium (or (ε_1, ε_2)-SPE), which encompasses
SPE as a special case when ε_1 = ε_2 = 0.

Definition 1. ((ε_1, ε_2)-SPE) Denote the concatenation operator for histories as ;. Strategy
profile σ is an (ε_1, ε_2)-SPE if for history h in which player 1 has to play, given
h' = h; σ_1(h) and h'' = h; s_1,

$$\begin{aligned}
&E(r_1(\sigma_1(h), y))[\sigma_1(h), \sigma_2(h')] + \delta_1 E(g_1(\sigma, h'; y))[\sigma_1(h), \sigma_2(h')] \\
&\quad \ge E(r_1(s_1, y))[s_1, \sigma_2(h'')] + \delta_1 E(g_1(\sigma, h''; y))[s_1, \sigma_2(h'')] - \epsilon_1
\end{aligned}$$

for all s_1. For history h in which player 2 has to play, given a(h) is the last action by
player 1 in h, for all s_2

$$\begin{aligned}
&E(r_2(\sigma_2(h), y))[a(h), \sigma_2(h)] + \delta_2 E(g_2(\sigma, h; y))[a(h), \sigma_2(h)] \\
&\quad \ge E(r_2(s_2, y))[a(h), s_2] + \delta_2 E(g_2(\sigma, h; y))[a(h), s_2] - \epsilon_2 .
\end{aligned}$$

We are particularly interested in (ε_1, 0)-SPE, where player 1 is the defender and player
2 is the adversary. By setting ε_2 = 0, we ensure that a rational adversary will never
deviate from the expected equilibrium behavior. Such equilibria are important in security
games, since ε_2 > 0 could incentivize the adversary to deviate from her strategy,
possibly resulting in significant loss to the defender.

The  following  useful  property  about  history-independent  strategies,  which  follows 
directly  from  the  definition,  helps  in  understanding  our  proposed  history-independent 
audit  strategy. 

Property 1. If a strategy profile σ is history-independent, i.e., σ_1(h) = σ_1() and
σ_2(h) = σ_2(a(h)), then the condition to test for SPE reduces to
E(r_1(σ_1(), y)) ≥ E(r_1(s_1, y)) for player 1 and to E(r_2(σ_2(h), y)) ≥ E(r_2(s_2, y))
for player 2, since g_i(σ, h; y) is the same for all y and each i. Also, if
E(r_i(s_i, y)) - E(r_i(σ_i(h), y)) ≤ ε_i for all i and s_i, then σ is an (ε_1, ε_2)-SPE
strategy profile.
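
The test in Property 1 is a single-round comparison, which the following Python sketch makes explicit. It is only an illustration: the expected round utilities are supplied by the caller as a function from actions to numbers, and both function names are hypothetical.

```python
def max_deviation_gain(expected_round_utility, committed_action, alternatives):
    """Largest gain in expected round utility from a unilateral deviation."""
    base = expected_round_utility(committed_action)
    return max(expected_round_utility(s) - base for s in alternatives)


def is_eps_spe(gain_player1, gain_player2, eps1, eps2=0.0):
    """Property 1 test for a history-independent strategy profile."""
    return gain_player1 <= eps1 and gain_player2 <= eps2


if __name__ == "__main__":
    # Toy expected round utilities (made-up numbers, for illustration only).
    u1 = {"audit": -5.0, "skip": -4.0}
    u2 = {"comply": 0.0, "violate": -2.0}
    g1 = max_deviation_gain(u1.get, "audit", u1)
    g2 = max_deviation_gain(u2.get, "comply", u2)
    print(is_eps_spe(g1, g2, eps1=1.5))
```

For player 2 the supplied utility function would be evaluated at the action a(h) already committed to by player 1.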

4.2  Equilibrium  in  the  Audit  Game 

We next state an equilibrium strategy profile for the game G_{A,k}. Formally, we present
an (ε_{A,k}, 0)-SPE strategy profile, and calculate the value ε_{A,k}. The proposed strategy
relies on commitment by D and computation of a single round best response by A. We
accordingly refer to this strategy profile as a simple commitment strategy profile.

For any equilibrium to be played out with certainty, players must believe that the
strategy being used by the other players is the equilibrium strategy. Our proposed strategy
profile has features that aim to achieve correct beliefs for the players, even in the face of
partial rationality. One feature is that D makes its strategy publicly known, and provides
a means to verify that it is playing that strategy. As noted earlier, even though D acts
after A does, by committing to its strategy with a verification mechanism D simulates a
first move by making the employee believe its commitment with probability one. Thus,
we envision the organization making a commitment to stick to its strategy and providing
a proof that it follows the strategy. Further, D making its strategy publicly known follows
the general security principle of not making the security mechanisms private [25].
Additionally, the simple commitment strategy profile is an approximate SPE for all values
of parameters in any game G_{A,k} and any value of A's discount factor δ_A. Thus, all
employees observe the organization following a consistent strategy, further reducing any
variability in beliefs about the organization's strategy. Another important feature of the
simple commitment strategy profile is the single round best response computation by A
(yielding a single action to play), which is much simpler than optimizing over multiple
rounds, which often yields many strategies as the solution. Thus, the organization can also
trust the employee to make the appropriate decision even if the employee is computationally
constrained. These features keep the simple commitment strategy profile simple, which
makes it more likely to be followed in the real world.

The main idea behind the definition of our strategy profile is that D optimizes its
utility assuming the best response of A for a given a^t. That is, D assumes that A does
not commit any violations when (P, α) is in the deterred region, and systematically
commits a violation otherwise (i.e., all of A's tasks are violations). Further, D assumes
the worst case when the employee (with probability ε_th) accidentally makes a mistake in
the execution of their strategy; in such a case, D expects all of A's tasks to be violations,
regardless of the values of (P, α). This is because the distribution D_0^t over violations
when A makes a mistake is unknown. Thus, the expected cost function that D optimizes
(for each total number of tasks a^t) is a linear sum of (1 - ε_th) times the cost due to the
best response of A and ε_th times the cost when A commits all violations. The expected cost
function is different in the deterred and non-deterred regions due to the difference in the
best response of A in these two regions. The boundary between the deterred and
non-deterred regions is determined by the value of the adversary's personal benefit I. We
assume that D learns the value of the personal benefit within an error δI of its actual
value, and that D does not choose actions (P, α) in the region of uncertainty determined
by the error δI.

Formally, the expected reward is E(Rew^t_D)[0] when the adversary commits no
violation, and E(Rew^t_D)[a^t] when all a^t tasks are violations. Both of these expected
rewards are functions of P, α; we do not make that explicit for notational ease. Denote
the deterred region determined by the parameter I and the budget b^t_{A,k} as R^I_D and the
non-deterred region as R^I_ND. Either of these regions may be empty. Denote the region
(of uncertainty) between the curves determined by I + δI and I - δI as R^I_δI. Then the
reduced deterred region is given by R^I_D \ R^I_δI and the reduced non-deterred region by
R^I_ND \ R^I_δI. The equilibrium strategy we propose is:

• For each possible number of tasks a^t that can be performed by A, D, constrained by
budget b^t_{A,k}, assumes the expected utility

$$U_D(P, \alpha) = (1 - \epsilon_{th})\,E(\mathrm{Rew}^t_D)[0] + \epsilon_{th}\,E(\mathrm{Rew}^t_D)[a^t] \quad \text{and}$$
$$U_{ND}(P, \alpha) = (1 - \epsilon_{th})\,E(\mathrm{Rew}^t_D)[a^t] + \epsilon_{th}\,E(\mathrm{Rew}^t_D)[a^t],$$

in R^I_D \ R^I_δI and R^I_ND \ R^I_δI respectively. D calculates the maximum expected utility
across the two regions as follows:

- $U_D^{max} = \max_{(P,\alpha) \in R^I_D \setminus R^I_{\delta I}} U_D(P, \alpha)$, $U_{ND}^{max} = \max_{(P,\alpha) \in R^I_{ND} \setminus R^I_{\delta I}} U_{ND}(P, \alpha)$

- $U = \max(U_D^{max}, U_{ND}^{max})$

D commits to the corresponding maximizer (P, α) for each a^t.
After knowing a^t, D plays the corresponding (P, α).

• A plays her best response (based on the committed action of D), i.e., if she is deterred
for all a^t she commits no violations, and if she is not deterred for some a^t then all her
tasks are violations, and she chooses the a^t that maximizes her utility from violations.
However, she also commits mistakes with probability ε_th, and then the action is determined
by distribution D_0^t.
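
The defender's side of the strategy above can be sketched as a grid search. The following Python fragment is a minimal sketch under stated assumptions, not the authors' implementation: it uses the example linear functions and sample constants of Section 3, takes p = 0.5 and an error δI of 0.4 as placeholder values, and discretizes P and α in steps of 0.5 and 0.05. The function names are hypothetical.

```python
def nu(alpha, mu=1.5):
    """Detection probability nu(alpha) = mu*alpha - (mu - 1)*alpha**2."""
    return mu * alpha - (mu - 1.0) * alpha ** 2


def defender_reward(v, a, P, alpha, R_int=300.0, R_ext=300.0, p=0.5,
                    C=50.0, e=10.0, mu=1.5):
    """E(Rew_D)[v]: violation loss, audit cost C*alpha*a, punishment loss e*P."""
    n = nu(alpha, mu)
    return -v * (R_int * n + R_ext * p * (1.0 - n)) - C * alpha * a - e * P


def commit(a, budget, I=6.0, dI=0.4, eps_th=0.03, P_f=10.0, p=0.5,
           C=50.0, mu=1.5, p_step=0.5, a_step=0.05):
    """Return (utility, P, alpha, region) maximizing D's assumed utility."""
    best = None
    for i in range(int(round(P_f / p_step)) + 1):
        for j in range(int(round(1.0 / a_step)) + 1):
            P, alpha = i * p_step, j * a_step
            if C * alpha * a > budget + 1e-9:
                continue  # inspections must fit in the budget
            n = nu(alpha, mu)
            expected_punishment = P * (n + p * (1.0 - n))
            if abs(expected_punishment - I) <= dI:
                continue  # skip the region of uncertainty around the separator
            all_viol = defender_reward(a, a, P, alpha, p=p, C=C, mu=mu)
            if expected_punishment > I:  # reduced deterred region
                no_viol = defender_reward(0, a, P, alpha, p=p, C=C, mu=mu)
                u = (1 - eps_th) * no_viol + eps_th * all_viol
                region = "deterred"
            else:  # reduced non-deterred region: U_ND equals the all-violation reward
                u = all_viol
                region = "non-deterred"
            if best is None or u > best[0]:
                best = (u, P, alpha, region)
    return best


if __name__ == "__main__":
    print(commit(a=40, budget=2000.0))
```

The adversary's best response then follows from the returned region: no violations if deterred, all tasks as violations otherwise.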

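Let $U^{max}_{D+\delta I} = \max_{(P,\alpha) \in R^I_D \cup R^I_{\delta I}} U_D(P, \alpha)$, $U^{max}_{ND+\delta I} = \max_{(P,\alpha) \in R^I_{ND} \cup R^I_{\delta I}} U_{ND}(P, \alpha)$,
$\delta U_D = U^{max}_{D+\delta I} - U^{max}_D$ and $\delta U_{ND} = U^{max}_{ND+\delta I} - U^{max}_{ND}$. We have the following result: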

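Theorem 1. The simple commitment strategy profile (defined above) is an (ε_{A,k}, 0)-SPE
for the game G_{A,k}, where ε_{A,k} is

$$\max\Big(\max_{v^t, a^t} \delta U_D,\ \max_{v^t, a^t} \delta U_{ND}\Big) + \epsilon_{th} \max_{\alpha \in [0,1]} \sum_{j=0}^{\tau-1} \delta_D^j\, E(r(O^{t+j}))[U_k, U_k, \alpha].$$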


Remark 1. If the value of any parameter of the game (e.g., R_ext, R_int) is perturbed in a
bounded manner, then accounting for that in the analysis yields an (ε, 0)-SPE, but with
ε greater than ε_{A,k}. This happens because D's utility is continuous in the parameters.


The proof is in Appendix B. The proof involves showing that the strategy profile has
the single stage deviation property. That A does not profit from deviating is immediate,
since A chooses the best response in each round of the game. The bound on
profit from deviation for D has two terms. The first term arises due to D ignoring
the region of uncertainty in maximizing its utility. The maximum difference in utility
for the deterred region is max_{v^t,a^t}(U^max_{D+δI} - U^max_D) and for the undeterred region
is max_{v^t,a^t}(U^max_{ND+δI} - U^max_ND). The first term is the maximum of these quantities. The
second term arises due to the use of the worst case assumption of all violations out of
the maximum possible U_k tasks when A makes a mistake, as compared to the case when D_0^t
is known. Since A's choice only affects the violation loss part of D's utility and mistakes
happen with probability ε_th, the second term is the maximum possible violation
loss multiplied by ε_th.

Numeric applications. The above theorem can be used to calculate concrete values
for ε_{A,k} when all parametric functions are instantiated. For example, with the values
in Section 3, we obtain ε_{A,k} = $200. Assuming A performs the maximum number of tasks
U_k = 40, ε_{A,k} is about 9.5% of the cost of auditing all actions of A with
maximum punishment rate ($2100), with no violations, and about 3.3% of the cost
incurred due to all violations caught externally ($6000), with no internal auditing or
punishment. Similarly, if we assume 70% audit coverage with maximum punishment
and four violations, the expected cost for the organization is $2583, which means ε_{A,k}
corresponds to about 7.7% of this cost. We present the derivation of the value of ε_{A,k} in
Claim B in Appendix B. The audit coverage here is for one employee only; hence it can
be as high as 100%. Also, since G_A is a parallel composition of the games G_{A,k} for all
k, we claim that the simple commitment strategy profile followed for all games G_{A,k} is
a (Σ_k ε_{A,k}, 0)-SPE strategy profile for G_A. (See Lemma 1 in Appendix B.1.)
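
The three cost figures quoted above can be reproduced from the example functions of Section 3. The following Python sketch assumes the linear cost and loss functions, ν(α) = μα - (μ - 1)α^2 with μ = 1.5, and an external detection probability p = 0.5; the value of p is an assumption inferred from the $6000 figure (40 violations at $300 each, caught externally with probability p). The function names are hypothetical.

```python
def nu(alpha, mu=1.5):
    """Detection probability nu(alpha) = mu*alpha - (mu - 1)*alpha**2."""
    return mu * alpha - (mu - 1.0) * alpha ** 2


def expected_cost(v, a, P, alpha, R_int=300.0, R_ext=300.0, p=0.5,
                  C=50.0, e=10.0, mu=1.5):
    """Expected per-round cost to D (the negative of E(Rew_D))."""
    n = nu(alpha, mu)
    return v * (R_int * n + R_ext * p * (1.0 - n)) + C * alpha * a + e * P


if __name__ == "__main__":
    eps = 200.0  # value of epsilon_{A,k} reported in the text
    scenarios = {
        "full audit, max punishment, no violations": expected_cost(0, 40, 10.0, 1.0),
        "no audit, no punishment, 40 violations": expected_cost(40, 40, 0.0, 0.0),
        "70% audit, max punishment, 4 violations": expected_cost(4, 40, 10.0, 0.7),
    }
    for name, cost in scenarios.items():
        print(f"{name}: ${cost:.0f}, eps/cost = {100 * eps / cost:.1f}%")
```

Under these assumptions the three scenarios evaluate to $2100, $6000 and $2583, matching the percentages 9.5%, 3.3% and 7.7% stated in the text.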


4.3 Comparison with other equilibria

In this section, we compare our proposed strategy with the set of Perfect Public Equilibrium
(PPE) strategies. A PPE is the appropriate notion of equilibrium in an imperfect
information repeated game with public signals and simultaneous moves. A PPE is quite
similar to a SPE; the differences are that histories are sequences of public signals (instead
of action profiles) and payoffs are considered in the expected sense. PPE strategy
profiles also have the single stage deviation property. As pointed out already, one advantage
of the simple commitment strategy is simplicity. As the set of PPE strategies
is often infinite, it is difficult for players' beliefs to agree on the strategy being played.
However, a commitment by one player to her part of a PPE strategy profile forces that
particular PPE to be played. The organization is naturally the player who commits.
A committed utility maximizing player is one who uses a commitment to force the PPE
that yields the maximum payoff to that player. A privacy preserving defender is one that
chooses a PPE with fewer violations when it has a choice over multiple PPE with the
same payoff for the defender. The next theorem shows that the simple commitment strategy
deters A as often as the case in which the chosen PPE strategy deters A, assuming
the budget allows for deterring the employee and the organization is committed utility
maximizing and privacy preserving in choosing a PPE. Stated succinctly, the
simple commitment strategy profile is no worse for privacy protection than choosing
the highest utility PPE in scenarios where the organization chooses a PPE strategy that
deters the employee.

Theorem 2. Assume that the budget is fixed in every round and is sufficient to deter A, and
that the number of tasks performed by A in every round is fixed. Let v* be the maximum PPE
payoff that D can obtain. Further suppose there exists a PPE E_m in which D always
plays some action in the deterred region and the utility for D with E_m is v*. Then a
committed utility maximizing and privacy preserving D will choose to play E_m. Further,
the action in E_m coincides with the action chosen by the simple commitment strategy profile
in each round.


5  Budget  Allocation 

In this section we present two budget allocation mechanisms: one maximizes D's utility
(Section 5.1) and another does the same under accountability constraints (Section 5.2).

5.1  Optimized  Budget  Allocation 

We assume the budget available to D for all audits is bounded by B. Then we must have
Σ_{A,k} b^t_A(k) + Cost(M) ≤ B, where Cost(M) is a fixed cost of using the log analysis
tool in an audit cycle. Let B_M = B - Cost(M). Let α_{A,k}(b^t_A(k), a^t_A(k)),
P_{A,k}(b^t_A(k), a^t_A(k)) be the equilibrium in game G_{A,k} for budget b^t_A(k) and A's
tasks a^t_A(k). Note that we make the dependence on b^t_A(k), a^t_A(k) explicit here. Let
U(b^t_A(k), a^t_A(k)) denote the corresponding expected utility in game G_{A,k}. Observe
that in equilibrium, when A is deterred for all possible a^t_A(k) then A has equal
preference for all possible a^t_A(k), and otherwise A chooses the maximum a^t_A(k) for
which she is undeterred to maximize her utility. Thus, let BR(b^t_A(k)) be the set of
numbers of tasks all of which are part of best responses of A. Note that the cost functions
U_D and U_ND in deterred and non-deterred regions are continuous in b^t_A(k), since the
regions themselves change continuously with change in b^t_A(k). Also, by definition they
are continuous in a^t_A(k). Since U is the maximum of two continuous functions U_D and
U_ND, using the fact that the max of two continuous functions is continuous, we get that U
is continuous in both arguments. Then, the optimal allocation of budget is to solve the
following non-linear optimization problem

$$\max \sum_{A,k}\ \min_{a^t_A(k) \in BR(b^t_A(k))} U(b^t_A(k), a^t_A(k)) \quad \text{subject to } b^t_A(k) \ge 0 \text{ and } \sum_{A,k} b^t_A(k) \le B_M ,$$

which maximizes the minimum utility possible over A's possible best response actions.

For example, consider a simple case with two types of tasks, celebrity record accesses
and non-celebrity record accesses, and one employee. Assume the utility functions and
constants as stated at the end of Section 3, except that it is known a priori
that exactly 40 celebrity and 400 non-celebrity accesses would be made, and the values of
some constants (in brackets) are different for the celebrity type (R_ext = $4500,
R_int = $300, I = $6, P_f = 10) and the non-celebrity type (R_ext = $90, R_int = $30,
I = $0.6, P_f = 5). Using discrete steps and a brute force search yields a solution of the
above optimization problem in which D would allocate $1300 to audit celebrity accesses and
the remaining $1200 to audit non-celebrity accesses. As the cost per inspection was
assumed to be $50 (Section 3), a 0.65 fraction of celebrity accesses can be inspected and
only 24 out of 400 non-celebrity accesses can be inspected. However, the equilibrium yields
that no non-celebrity inspections happen, as the employee is non-deterred for the level of
non-celebrity inspections possible, and a 0.65 fraction of celebrity accesses are inspected.

5.2  Towards  Accountable  Data  Governance 

While holding an employee responsible for the violation she causes is natural, it is
difficult to define accountability for the organization, as the organization does not commit
violations directly. However, the organization influences the actual violator (employee)
by the choice of inspections and punishment. We use a simple definition of accountability
for the organization, requiring a minimum level of inspection and punishment.

Definition 2. ((M, α, P)-accountability) An organization satisfies (M, α, P)-accountability
if 1) its log analysis tool M' satisfies M' ≥ M, 2) its level of inspection satisfies
α' ≥ α, and 3) its punishment rate satisfies P' ≥ P.

Our definition assumes a partial ordering over log analysis tools M. This partial
ordering could be given from empirically computed accuracy estimates μ for each log
analysis tool (e.g., we could say that M_1 ≥ M_2 if M_1 is at least as accurate as M_2
for each type of access k). The dependence of accountability on M is required because a
better performing tool can detect the same expected number of violations as another
tool with worse performance, with a lower inspection level α. We envision the above
accountability being proven by the organization to a trusted third party external auditor
(e.g., the government) by means of a formal proof, in the same manner as commitment is
demonstrated to the employee.

To satisfy (M, α, P)-accountability an organization must add the following constraints
to its optimization problem from the last sub-section:

$$\min_{a^t_A(k) \in BR(b^t_A(k))} \alpha_{A,k}(b^t_A(k), a^t_A(k)) \ge \alpha(k) \quad \text{and} \quad \min_{a^t_A(k) \in BR(b^t_A(k))} P_{A,k}(b^t_A(k), a^t_A(k)) \ge P(k)$$

for all A, k. The first constraint ensures that the minimum number of inspections
divided by the maximum number of tasks is at least α(k), and the second constraint
ensures that the minimum punishment level is at least P(k).

Continuing the example from the last sub-section, if the minimum α and P are specified
as 0.1 and 1.0 for both types of accesses, then D would allocate $400 to audit celebrity
accesses and the remaining $2100 to audit non-celebrity accesses. Since the cost per
inspection was assumed to be $50 (Section 3), a 0.2 fraction of celebrity accesses can be
inspected and 42 out of 400 non-celebrity accesses can be inspected. However, according
to the equilibrium, 40 non-celebrity inspections happen at a punishment level of 2.0, as
the employee is already deterred for that level of non-celebrity inspections. In this case,
unlike the non-accountable scenario, the values α, P ensure that the privacy of the common
person is protected even when the organization has more economic incentive to
audit celebrity accesses more heavily.
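
The inspection and punishment parts of the accountability check in this example can be sketched in a few lines of Python. The fragment below is an illustration under simplifying assumptions: it ignores the ordering over log analysis tools, takes the equilibrium (α, P) pairs for each best response as given inputs, and uses a hypothetical function name.

```python
def satisfies_accountability(equilibria, alpha_min, P_min):
    """Check the alpha and P constraints of (M, alpha, P)-accountability.

    equilibria: list of (alpha, P) pairs, one per best response of A.
    """
    return (min(alpha for alpha, _ in equilibria) >= alpha_min
            and min(P for _, P in equilibria) >= P_min)


if __name__ == "__main__":
    # Non-celebrity accesses in the example: 40 of 400 inspected, P = 2.0.
    print(satisfies_accountability([(40 / 400, 2.0)], alpha_min=0.1, P_min=1.0))
    # Non-accountable allocation of Section 5.1: no non-celebrity inspections.
    print(satisfies_accountability([(0.0, 0.0)], alpha_min=0.1, P_min=1.0))
```

The first call corresponds to the accountable allocation and passes both constraints; the second corresponds to the allocation of Section 5.1 and fails the inspection constraint.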

6  Predictions  and  Interventions 

In this section, we use our model to predict observed practices in industry and the
effectiveness of public policy interventions in encouraging organizations to adopt
accountable data governance practices (i.e., conduct more thorough audits) by analyzing the
equilibrium audit strategy (P, α) under varying parameters. The explanation of observed
practices provides evidence that our audit model is not far from reality. We use the
values of parameters and instantiation of functions given in Section 3 (unless otherwise
noted). We assume that the value of personal benefit I is learned exactly and that P and
α take discrete values, with the discrete increments being 0.5 and 0.05, respectively. We
also  assume  for  sake  of  exposition  that  Uk  =  Uk ,  i.e.,  the  number  of  tasks  is  fixed,  there 
is  only  one  type  of  violation  and  the  budget  is  sufficient  to  do  all  possible  inspections. 

Average cost R_ext and probability p of external detection of violation. We vary R_ext
from $5 to $3900, with R_int fixed at $300. The results are shown in Figure 2. There are
two cases shown in the figure: p = 0.5 and p = 0.9. The figure shows the equilibria
(P, α) chosen for different values of R_ext.

Prediction 1: Increasing R_ext and p is an effective way to encourage organizations
to audit more. In fact, when p * R_ext is low, X may not audit at all. Thus, X audits to
protect itself from the greater loss incurred when violations are caught externally.
Surprisingly, the hospital may continue to increase inspection levels (incurring higher
cost) beyond the minimum level necessary to deter a rational employee. Hospital X does so
because the employee is not fully rational: even in the deterred region there is an ε_th
probability of violations occurring.


Fig. 2. Separators for two values of external detection probability p indicated by dashed
lines. Equilibrium punishment and inspection rates (P, α) marked on solid lines (see
legend) as the reputation loss from external detection R_ext varies; the R_ext values are
labeled above the corresponding equilibrium points.


Suggested  Intervention  1:  Subject  organizations  to  external  audits  and  fines  when 
violations  are  detected.  For  example,  by  awarding  contracts  for  conducting  150  external 
audits  by  2012  [26],  HHS  is  moving  in  the  right  direction  by  effectively  increasing  p. 
This  intervention  is  having  an  impact:  the  2011  Ponemon  study  on  patient  privacy  [27] 
states — “Concerns  about  the  threat  of  upcoming  HHS  HIPAA  audits  and  investigation 
has  affected  changes  in  patient  data  privacy  and  security  programs,  according  to  55 
percent  of  respondents.” 

Prediction 2: Interventions that increase the expected loss for both external and
internal detection of violations are not as effective in increasing auditing as those that
increase expected loss for external detection of violations only. Table 2 shows the
equilibrium inspection level as R_ext and R_int are both increased at the same rate. While
the inspection level may initially increase, it quickly reaches a peak. As an example,
consider the principle of breach detection notification used in many data breach laws [28].
The effect of breach detection notification is to increase both R_int and R_ext, since
notification happens for all breaches. While there isn't sufficient data for our model to
predict whether these laws are less effective than external audits (see suggested study
below), prior empirical analysis [28] indicates that the benefit in breach detection from
these laws is only about 6% (after adjusting for increased reporting of breaches due to
the law itself).

Suggested  study:  An  empirical  study  that  separately  reports  costs  incurred  when 
violations  are  internally  detected  from  those  that  are  externally  detected  would  be  useful 
in  quantifying  and  comparing  the  effectiveness  of  interventions.  Existing  studies  either 
do  not  speak  of  these  distinct  categories  of  costs  [19,28]  or  hint  at  the  importance  of 
this  distinction  without  reporting  numbers  [16, 17]. 

Punishment loss factor e and personal benefit I.

Prediction 3: Employees with a higher value of e (e.g., doctors have higher e; suspending
a doctor is costlier for the hospital than suspending a nurse) will have lower punishment
levels. If punishments were free, i.e., e = 0 (an unrealistic assumption), X would always
keep the punishment rate at maximum according to our model. When punishment is costly
(e = 1000), X will favor increasing inspections rather than increasing the punishment
level P (see Table 1 in Appendix A). While we do not know of an industry-wide study on
this topic, there is evidence of such phenomena occurring in hospitals. For example, in
2011 Vermont's Office of Professional Regulation, which licenses nurses, investigated 53
allegations of drug diversion by nurses and disciplined 20. In the same year, the Vermont
Board of Medical Practice, which regulates doctors, listed 11 board actions against licensed
physicians for a variety of offenses. However, only one doctor had his license revoked
while the rest were allowed to continue practicing [7].

Prediction 4: Employees who cannot be deterred are not punished. When the personal
benefit I of the employee is high, our model predicts that D chooses the punishment
rate P = 0 (because this employee cannot be deterred at all) and increases
inspection as Rext increases, to minimize the impact of violations by catching them
internally (see Table 4 in Appendix A). Note that this is true only for violations that are
not very costly (as is the case for our choice of costs). If the expected violation cost is
more than the value generated by the employee, then it is better to fire the non-deterred
employee (see Appendix B.2).

Audit cost C and performance factor μ of log analysis tool.

Prediction 5: If audit cost C decreases or the performance μ of log analysis increases,
then the equilibrium inspection level increases. The data supporting this prediction is
presented in Tables 3 and 5 in Appendix A. Intuitively, it is expected that if the cost
of auditing goes down then organizations would audit more, given their fixed budget
allocated for auditing. Similarly, a more efficient mechanized audit tool will enable the
organization to increase its audit efficiency with the fixed budget. For example, MedAssets
claims that Stanford Hospitals and Clinics saved $4 million by using automated
tools for auditing [29].


7  Related  Work 

Auditing and Accountability: Prior work studies orthogonal questions of algorithmic
detection of policy violations [30-33] and blame assignment [34-37]. Feigenbaum et
al. [38] report work in progress on formal definitions of accountability capturing the
idea that violators are punished with or without identification and mediation with non-zero
probability, and punishments are determined based on an understanding of “typical”
utility functions. Operational considerations of how to design an accountability
mechanism that effectively manages organizational risk are not central to their work. In
other work, auditing is employed to revise access control policies when unintended accesses
are detected [39-41]. Another line of work uses logical methods for enforcing a
class of policies, which cannot be enforced using preventive access control mechanisms,
based on evidence recorded in audit logs [42]. Cheng et al. [43,44] extend access control
by allowing agents access based on risk estimations. A game-theoretic approach
of coupling access control with audits of escalated access requests in the framework
of a single-shot game is studied by Zhao et al. [45]. These works are fundamentally
different from our approach. We are interested in scenarios where access control is not
desirable and audits are used to detect violations. We believe that a repeated game can
better model the repeated interactions of auditing.

Risk Management and Data Breaches: Our work is an instance of a risk management
technique [12, 13] in the context of auditing and accountability. As far as we know, our
technique is the first instance of managing risk in auditing using a repeated game formalism.
Risk assessment has been extensively used in many areas [10, 11]; the report
by the American National Standards Institute [15] provides a risk assessment mechanism
for healthcare. Our model also captures data breaches that happen due to insider attacks.
Reputation has been used to study insider attacks in non-cooperative repeated games
[46]; we differ from that work in that the employer-employee interaction is essentially
cooperative. Also, the primary purpose of interaction between employer and employee
is to accomplish some task (e.g., provide medical care). Privacy is typically a secondary
concern. Our model captures this reality by considering the effect of non-audit interactions
in parameters like Pf. There are quite a few empirical studies on data breaches
and insider attacks [16, 19, 28] and qualitative models of insider attacks [47]. We use
these studies to estimate parameters and evaluate the predictions of our model.


8  Conclusion  and  Future  Work 

First, as public policy and industry move towards accountability-based privacy governance,
the biggest challenge is how to operationalize requirements such as internal
enforcement of policies. We believe that principled audit and punishment schemes like
the one presented in this paper can inform practical enforcement regimes. Second, a
usual complaint against this kind of risk management approach is that there isn’t data
to estimate the risk parameters. We provide evidence that a number of parameters in
the game model can be estimated from prior empirical studies while recognizing the
need for more scientific studies with similar goals, and suggest specific studies that can
help estimate other parameters. Third, our model makes an interesting prediction that
merits further attention: it suggests that we should design interventions that increase the
expected loss from external detection of violations significantly more than the expected
loss from internal detection.

While our model captures a number of important economic considerations that influence
the design of audit mechanisms, there is much room for further refinement.
For example, the model does not handle colluding adversaries, nor does it account for
detection of violations in audit rounds other than the one in which the violation was
committed. Also, our treatment of accountable data governance leaves open questions
about the trade-off between utility maximization and privacy protection. Moving forward,
we plan to generalize our model, explore the space of policy interventions to
encourage accountable data governance, and address normative questions such as what
are appropriate levels of inspections and punishments for accountable data governance.


A Experimental Outcomes Supporting Predictions

Table 3. P, α for varying C

Table 4. P, α for I = 50

Rext    P       α
...     ...     0.6
748     0       0.85
790     0       1.0

Table 5. P, α for varying μ

μ       P       α
1.0     10.0    0.3
1.2     9.5     0.35
1.3     9.5     0.35
1.40    9.0     0.45
1.5     9.0     0.45
1.6     8.5     0.5
1.7     8.5     0.5


B  Proofs 


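Reminder of Theorem 1. The simple commitment strategy profile (defined above) is an
$(\epsilon_{A,k}, 0)$-SPE for the game $G_{A,k}$, where $\epsilon_{A,k}$ is

$$\max\Big( \max_{v^t, a^t} \big( U^{D+\delta I}_{\max} - U^{D}_{\max} \big),\; \max_{v^t, a^t} \big( U^{ND+\delta I}_{\max} - U^{ND}_{\max} \big) \Big) + \epsilon_{th} \max_{\alpha \in [0,1]} \Big( \sum_{j=0}^{m-1} \delta_{\mathcal{D}}^{j}\, E(r(O^t, j))[U_k, U_k, \alpha] \Big)$$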


Proof. First the easy case for the employee: the employee always plays a best response.
When deterred she is indifferent among any $a^t$, so the choice of $a^t$ does not matter in that
case. Thus, there is 0 benefit for the employee from deviating, given the history-independent
strategy followed. There are two terms in the $\epsilon_{A,k}$ bound for the organization. The first
term bounds the profit from deviation due to the fact that the true $I$ is not known. The
second term further bounds the profit from deviation due to the fact that the distribution
$D_0^t$ is unknown.

Note that we have lifted the action space of $\mathcal{D}$ to commitment functions. Thus,
we need to compare the given commitment with other commitment functions. First,
note that if the regions were known exactly, and $D_0^t$ were known, then it would be possible
to find the commitment with optimal cost for each fixed value of $a^t$. Then, it is enough to
bound the difference in utility between the audit commitment function and this optimal
commitment function across all values of $a^t$. We perform the analysis for any fixed $a^t$,
then take the maximum over all $a^t$ to bound the difference in utility when $\mathcal{D}$ could
move to the optimal commitment. We first compare the audit commitment to itself
when the true regions are known; then, assuming the true regions are known, we compare
the audit commitment to the optimal commitment. Using the triangle inequality we
get the required difference for a fixed $a^t$. Then, using the fact that
$\max_x (f(x) + g(x)) \le \max_x f(x) + \max_x g(x)$, we get the required bound for all $a^t$.
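
The last step relies on the sub-additivity of the maximum. As a small sanity check (not part of the original proof), the sketch below samples random functions on a finite domain and confirms that the gap $\max_x f(x) + \max_x g(x) - \max_x (f(x) + g(x))$ is never negative. The helper name `max_sum_gap`, the domain size and the value ranges are illustrative assumptions.

```python
import random

def max_sum_gap(f_vals, g_vals):
    """Return (max f + max g) - max(f + g) for two functions on a shared finite domain."""
    combined = max(f + g for f, g in zip(f_vals, g_vals))
    return max(f_vals) + max(g_vals) - combined

# Hypothetical random instances; the seed, domain size and ranges are arbitrary choices.
rng = random.Random(0)
gaps = []
for _ in range(1000):
    f_vals = [rng.uniform(-10, 10) for _ in range(20)]
    g_vals = [rng.uniform(-10, 10) for _ in range(20)]
    gaps.append(max_sum_gap(f_vals, g_vals))

worst = min(gaps)  # smallest observed gap; the inequality predicts it is never negative
```

The gap is zero exactly when both functions attain their maximum at a common point, which is why bounding the two deviation terms separately can only loosen the final bound.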

Suppose  the  simple  commitment  strategy  profile  finds  a  point  in  the  region 
The largest true deterred region can be f?£, U Rgj. Thus, $U^{D+\delta I}_{\max} - U^{D}_{\max}$ represents the
maximum profit the organization could have obtained by deviating to another point
using the true deterred region in the simple commitment strategy, with some fixed value
of $v^t$ and $a^t$. Then the maximum taken over $v^t$ and $a^t$ gives the maximum possible
profit by deviation for the deterred region if the true deterred region were known. A similar
argument shows that the maximum profit by deviation for the non-deterred region is
$\max_{v^t, a^t} (U^{ND+\delta I}_{\max} - U^{ND}_{\max})$. Thus, the absolute maximum profit from deviation for
any region is given by the maximum of these two quantities.

Next, assume that the true deterred region $R_D$ is known, and so is the non-deterred
region $R_{ND}$. We have already bounded above the maximum profit from deviation
that the organization would get by using the simple commitment strategy with the true
regions $R_D$ and $R_{ND}$ instead of the regions it actually uses. Assume that the true regions
are known and the simple commitment strategy outputs $(P, \alpha)$ to be played by the
organization. We use the simplified notation with the game under consideration being
$G_{A,k}$. Denote by $f(P, \alpha)$ the function $E(Rew_{\mathcal{D}}^t)[0]$, by $g(P, \alpha)$ the function
$E(Rew_{\mathcal{D}}^t)[a^t]$, and by $h(P, \alpha)$ the function $E(Rew_{\mathcal{D}}^t)[D_0^t]$. The function
maximized by $(P, \alpha)$ is

$$U_D(P, \alpha) = (1 - \epsilon_{th}) f(P, \alpha) + \epsilon_{th}\, g(P, \alpha)$$

in $R_D$ and is

$$U_{ND}(P, \alpha) = (1 - \epsilon_{th}) g(P, \alpha) + \epsilon_{th}\, g(P, \alpha)$$

in $R_{ND}$. Suppose $D_0^t$ was known and the point $(P', \alpha')$ is obtained by maximizing

$$U'_D(P, \alpha) = (1 - \epsilon_{th}) f(P, \alpha) + \epsilon_{th}\, h(P, \alpha)$$

in the $R_D$ region and

$$U'_{ND}(P, \alpha) = (1 - \epsilon_{th}) g(P, \alpha) + \epsilon_{th}\, h(P, \alpha)$$

in the $R_{ND}$ region. We emphasize that the function $U'$ is the true expected utility.
Consider two different cases:

- $(P, \alpha)$ and $(P', \alpha')$ both lie in the same region, say $R_D$. Then, the maximum benefit
to be gained out of deviation is $U'_D(P', \alpha') - U'_D(P, \alpha)$, which is

$$(1 - \epsilon_{th})(f(P', \alpha') - f(P, \alpha)) + \epsilon_{th}(h(P', \alpha') - h(P, \alpha))$$

Also, since $U_D(P, \alpha) \ge U_D(P', \alpha')$ we have

$$\epsilon_{th}(g(P, \alpha) - g(P', \alpha')) \ge (1 - \epsilon_{th})(f(P', \alpha') - f(P, \alpha))$$

Thus, the maximum benefit is upper bounded by

$$\epsilon_{th}\,(g(P, \alpha) - g(P', \alpha') + h(P', \alpha') - h(P, \alpha)).$$

The upper bound is the same for the non-deterred case, since in that case the function
$f(\cdot, \cdot)$ is replaced by $g(\cdot, \cdot)$ in both $U$ and $U'$ and the exact same calculation as
above yields the same bound.


- $(P, \alpha)$ and $(P', \alpha')$ lie in different regions, say $R_D$ and $R_{ND}$ respectively.
Then, the maximum benefit to be gained out of deviation is $U'_{ND}(P', \alpha') - U'_D(P, \alpha)$,
which is

$$(1 - \epsilon_{th})(g(P', \alpha') - f(P, \alpha)) + \epsilon_{th}(h(P', \alpha') - h(P, \alpha)).$$

Also, since $U_D(P, \alpha) \ge U_{ND}(P', \alpha')$ we have

$$\epsilon_{th}(g(P, \alpha) - g(P', \alpha')) \ge (1 - \epsilon_{th})(g(P', \alpha') - f(P, \alpha)).$$

Thus, the maximum benefit is upper bounded by

$$\epsilon_{th}\,(g(P, \alpha) - g(P', \alpha') + h(P', \alpha') - h(P, \alpha)).$$

Now suppose that $(P, \alpha)$ and $(P', \alpha')$ lie in $R_{ND}$ and $R_D$ respectively. Then, the
maximum benefit to be gained out of deviation is $U'_D(P', \alpha') - U'_{ND}(P, \alpha)$, which
is

$$(1 - \epsilon_{th})(f(P', \alpha') - g(P, \alpha)) + \epsilon_{th}(h(P', \alpha') - h(P, \alpha)).$$

Also, since $U_{ND}(P, \alpha) \ge U_D(P', \alpha')$ we have

$$\epsilon_{th}(g(P, \alpha) - g(P', \alpha')) \ge (1 - \epsilon_{th})(f(P', \alpha') - g(P, \alpha)).$$

Thus, the maximum benefit is upper bounded by

$$\epsilon_{th}\,(g(P, \alpha) - g(P', \alpha') + h(P', \alpha') - h(P, \alpha)).$$

The above cases show that the upper bound for profit from deviation in one round is
always

$$\epsilon_{th}\,(g(P, \alpha) - g(P', \alpha') + h(P', \alpha') - h(P, \alpha)).$$
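
The case analysis above reduces to one algebraic fact: whenever the committed point maximizes the surrogate utility, the true gain from switching is at most the stated bound. The following sketch checks this numerically for the same-region (deterred) case on random values. The function name, the sampling ranges and the tolerance are assumptions made for illustration; it does not evaluate the model's actual reward functions.

```python
import random

def deviation_bound_holds(eps, f, g, h, f2, g2, h2, tol=1e-9):
    """Check the one-round bound for the same-region (deterred) case.

    (f, g, h) are the values of f, g, h at the committed point (P, alpha);
    (f2, g2, h2) are their values at the alternative point (P', alpha').
    """
    u_commit = (1 - eps) * f + eps * g    # U_D(P, alpha), maximized by the strategy
    u_alt = (1 - eps) * f2 + eps * g2     # U_D(P', alpha')
    if u_commit < u_alt:
        return True  # premise fails: the strategy would not have chosen (P, alpha)
    gain = (1 - eps) * (f2 - f) + eps * (h2 - h)   # U'_D(P', alpha') - U'_D(P, alpha)
    bound = eps * (g - g2 + h2 - h)
    return gain <= bound + tol

# Hypothetical random rewards (all non-positive, as costs), with a fixed seed.
rng = random.Random(1)
all_ok = all(
    deviation_bound_holds(rng.uniform(0.0, 0.2), *[rng.uniform(-100, 0) for _ in range(6)])
    for _ in range(10000)
)
```

The mixed-region cases follow the same pattern, with $f$ replaced by $g$ at whichever point lies in the non-deterred region.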

Using the definition of expected rewards we have

$$g(P, \alpha) = -C(\alpha^t a^t) - e(P^t)$$

$$h(P, \alpha) = -C(\alpha^t a^t) - e(P^t) - \sum_{j=0}^{m-1} \delta_{\mathcal{D}}^{j}\, E_{D_0^t}(E(r(O^t, j)))$$

Note that for any $(P, \alpha)$

$$h(P, \alpha) - g(P, \alpha) = -\sum_{j=0}^{m-1} \delta_{\mathcal{D}}^{j}\, E_{D_0^t}(E(r(O^t, j))) \le 0,$$

thus, the upper bound above is further bounded by

$$\epsilon_{th}\,(g(P, \alpha) - h(P, \alpha)),$$

which is given by

$$\epsilon_{th} \Big( \sum_{j=0}^{m-1} \delta_{\mathcal{D}}^{j}\, E_{D_0^t}(E(r(O^t, j))) \Big)$$

Observe that the above term is maximized over the choice of $D_0^t$ when $D_0^t$ places all
probability mass on $a^t$ (for any $\alpha$), i.e., if $v^t = a^t$. Also, the expected value of $r$ should be
increasing in $v^t$ (since higher $v^t$ means higher detected violations), and $v^t = a^t$ takes a
maximum value of $U_k$ for game $G_{A,k}$. Thus, the above term is upper bounded by

$$\epsilon_{th} \Big( \sum_{j=0}^{m-1} \delta_{\mathcal{D}}^{j}\, E(r(O^t, j))[U_k, U_k, \alpha] \Big)$$

Now, add the two bounds to get the maximum profit from deviation in one round. Further,
using Property 1 and noting that the strategy is history independent for $G_{A,k}$, we
now obtain the desired result.

Claim. Assume function instantiations from Section ??. Thus, given
$\nu(\alpha) = \mu\alpha - (\mu - 1)\alpha^2$, we must have $\mu < 2$. Further, assuming
$C + 2(R_{int} - R_{ext}p) > 0$ and $R_{int} < R_{ext}p$, the $\epsilon_{A,k}$ from Theorem 1 is given by
$\epsilon_{th} U_k \max(R_{int}, R_{ext}p) + \Delta I_{A,k}$, where $\Delta I_{A,k}$ is

$$2\,\delta I \cdot \min\left( \frac{e}{p},\; \frac{U_k C}{\mu (1 - p)(I - \delta I)} \right)$$

Using values from the end of Section 3 we get $\epsilon_{th} U_k \max(R_{int}, R_{ext}p) = 0.03 \times
40 \times 150 = 180$; also, the minimum in $\Delta I_{A,k}$ is attained at $e/p = 20$. Assuming $\delta I = 0.5$
(remember z0 = 1; assume the learning reduces the region of uncertainty by half), we have
$\Delta I_{A,k} = 20$. Thus, we get $\epsilon_{A,k} = \$200$.
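
The arithmetic in the example can be retraced with a few lines of code. This is a sketch under stated assumptions: the function names are ours, `r_int=100.0` is a hypothetical placeholder (the text only requires $R_{int} < R_{ext}p = 150$), and the second arm of the minimum is set to infinity because the text reports only that $e/p = 20$ is the smaller arm.

```python
def delta_term(delta_i, e_over_p, inspection_arm):
    """Delta I_{A,k} = 2 * delta_I * min(e/p, U_k*C / (mu*(1-p)*(I - delta_I)))."""
    return 2 * delta_i * min(e_over_p, inspection_arm)

def epsilon_bound(eps_th, u_k, r_int, r_ext_p, delta_i, e_over_p, inspection_arm):
    """epsilon_{A,k} = eps_th * U_k * max(R_int, R_ext*p) + Delta I_{A,k}."""
    reputation_part = eps_th * u_k * max(r_int, r_ext_p)
    return reputation_part + delta_term(delta_i, e_over_p, inspection_arm)

# Values quoted in the text: eps_th = 0.03, U_k = 40, max(R_int, R_ext*p) = 150,
# delta_I = 0.5, e/p = 20. r_int and inspection_arm are hypothetical placeholders.
bound = epsilon_bound(eps_th=0.03, u_k=40, r_int=100.0, r_ext_p=150.0,
                      delta_i=0.5, e_over_p=20.0, inspection_arm=float("inf"))
```

With these inputs the reputation part evaluates to about 180 and the learning-error part to 20, matching the total of roughly $200 stated in the claim.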

Proof. Note that for $\nu(\alpha) \le 1$ to hold, it must be that $\mu\alpha - (\mu - 1)\alpha^2 \le 1$
for $\alpha \in [0, 1]$. It can be readily verified that this happens only when $\mu < 2$. Remember
that the linear functions assumption means $C(s^t) = C s^t$ and $e(P^t) = e P^t$. Then

$$\max_{\alpha \in [0,1]} \sum_{j=0}^{m-1} \delta_{\mathcal{D}}^{j}\, E(r(O^t, j))[U_k, U_k, \alpha] = U_k R_{ext} p + U_k \max_{\alpha \in [0,1]} (R_{int} - R_{ext} p)\, \nu(\alpha)$$

The relevant part to maximize can be expanded as

$$(R_{int} - R_{ext} p)(\mu\alpha - (\mu - 1)\alpha^2)$$

For $\mu < 2$, $\mu\alpha - (\mu - 1)\alpha^2$ increases with $\alpha \in [0, 1]$ (the derivative is positive). Thus, if
$R_{int} > R_{ext}p$ then $\alpha = 1$ is the maximizer, else $\alpha = 0$ is the maximizer. Then, it is not
difficult to conclude that the maximum value is $U_k \max(R_{int}, R_{ext}p)$.
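
The two facts used here, that $\nu$ is increasing and bounded by 1 on $[0, 1]$ when $\mu < 2$, and that the reputation term is therefore maximized at an endpoint, can be checked on a grid. The sketch below is illustrative only: the grid resolution, the sample values of $\mu$, $U_k$, $R_{int}$ and $R_{ext}p$, and the function names are assumptions.

```python
def nu(alpha, mu):
    """Detection function nu(alpha) = mu*alpha - (mu - 1)*alpha^2."""
    return mu * alpha - (mu - 1) * alpha ** 2

def is_increasing_and_bounded(mu, steps=1000):
    """Grid check that nu is non-decreasing on [0, 1] and never exceeds 1."""
    vals = [nu(i / steps, mu) for i in range(steps + 1)]
    increasing = all(b >= a for a, b in zip(vals, vals[1:]))
    bounded = max(vals) <= 1 + 1e-12
    return increasing and bounded

def reputation_term_max(u_k, r_int, r_ext_p, mu, steps=1000):
    """Grid-maximize U_k*R_ext*p + U_k*(R_int - R_ext*p)*nu(alpha) over alpha in [0, 1]."""
    return max(u_k * r_ext_p + u_k * (r_int - r_ext_p) * nu(i / steps, mu)
               for i in range(steps + 1))

# Hypothetical parameter values chosen only to exercise both branches of the max.
low_internal = reputation_term_max(40, 100.0, 150.0, 1.5)   # R_int < R_ext*p, alpha = 0 wins
high_internal = reputation_term_max(40, 200.0, 150.0, 1.5)  # R_int > R_ext*p, alpha = 1 wins
```

In both branches the grid maximum equals $U_k \max(R_{int}, R_{ext}p)$, and a value such as $\mu = 2.5$ fails the monotonicity and boundedness check.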

Now observe that, since $\mu < 2$, $\mu \ge \mu(\alpha) \ge 1$. The utility function maximized by
the organization given the linear function and the example reputation function is (using
simple notation)

$$-\epsilon_{th} R_{ext} p\, a^t - e P^t - \alpha^t a^t C - \nu(\alpha^t)\, a^t\, \epsilon_{th} (R_{int} - R_{ext} p),$$

$$-R_{ext} p\, a^t - e P^t - \alpha^t a^t C - \nu(\alpha^t)\, a^t (R_{int} - R_{ext} p)$$

in the reduced deterred region and reduced non-deterred region respectively. Observe
that for the non-deterred case, the assumption $C + 2(R_{int} - R_{ext}p) > 0$ implies
$C + \mu(\alpha)(R_{int} - R_{ext}p) > 0$, since $2 > \mu \ge \mu(\alpha) \ge 1$ and all quantities $C$, $R_{int}$ and $R_{ext}p$
are positive. Thus, the maximizer in the non-deterred region is always $(0, 0)$, irrespective of
the value of $I$, hence the difference in costs is zero for the cases when $I$ is known
perfectly and when there is an error $\delta I$.

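Assume $U^{D+\delta I}_{\max}$ occurs for a point $(P', \alpha')$ and $U^{D}_{\max}$ happens for a point $(P, \alpha)$, and the
learned value of personal benefit is $I$. The interesting case is when $(P', \alpha') \ne (P, \alpha)$ and
$(P, \alpha)$ lies on the curve defined by $I + \delta I$. Then suppose $(P', \alpha')$ lies on the curve defined by
$I + \delta I - \zeta$ for $2\delta I \ge \zeta > 0$. Suppose $(P', \alpha'')$ and $(P'', \alpha')$ are points on the curve defined
by $I + \delta I$ obtained by drawing straight lines from the point $(P', \alpha')$. Thus, $P' < P''$ and
$\alpha' < \alpha''$. Note that since $P'(\nu(\alpha') + p(1 - \nu(\alpha'))) = I + \delta I - \zeta$ and $\nu(\alpha), p \le 1$, we
can claim that $P' \ge I - \delta I$. Then we have

$$\zeta = P''(\nu(\alpha') + p(1 - \nu(\alpha'))) - P'(\nu(\alpha') + p(1 - \nu(\alpha')))$$

or

$$P'' - P' = \frac{\zeta}{\nu(\alpha')(1 - p) + p} \le \frac{\zeta}{p}$$

Also,

$$\zeta = P'(\nu(\alpha'') + p(1 - \nu(\alpha''))) - P'(\nu(\alpha') + p(1 - \nu(\alpha')))$$

or

$$\nu(\alpha'') - \nu(\alpha') = \frac{\zeta}{(1 - p)P'} \le \frac{\zeta}{(1 - p)(I - \delta I)}$$

Note that

$$\nu(\alpha'') - \nu(\alpha') = \mu(\alpha'' - \alpha') - (\mu - 1)(\alpha'' - \alpha')(\alpha'' + \alpha')$$

thus, $\nu(\alpha'') - \nu(\alpha') \ge \mu(\alpha'' - \alpha')$ and hence

$$(\alpha'' - \alpha') \le \frac{\zeta}{\mu(1 - p)(I - \delta I)}$$

Also, $U_D(P, \alpha) \ge U_D(P'', \alpha')$ and $U_D(P, \alpha) \ge U_D(P', \alpha'')$, and $U^{D+\delta I}_{\max} - U^{D}_{\max} =
U_{D+\delta I}(P', \alpha') - U_D(P, \alpha)$ means that

$$U^{D+\delta I}_{\max} - U^{D}_{\max} \le \min\{ U_{D+\delta I}(P', \alpha') - U_D(P'', \alpha'),\; U_{D+\delta I}(P', \alpha') - U_D(P', \alpha'') \}$$

Also, $U_{D+\delta I}(P', \alpha') - U_D(P'', \alpha')$ is given by

$$-e(P' - P'') \le \frac{e\,\zeta}{p}$$

Also, $U_{D+\delta I}(P', \alpha') - U_D(P', \alpha'')$ is given by

$$-a^t(\alpha' - \alpha'')C - a^t(\nu(\alpha') - \nu(\alpha''))\,\epsilon_{th}(R_{int} - R_{ext}p)$$

which can be simplified to

$$a^t(\alpha'' - \alpha')\big(C + (\mu - (\mu - 1)(\alpha'' + \alpha'))\,\epsilon_{th}(R_{int} - R_{ext}p)\big)$$

Using the result $1 \le \mu < 2$, we have $2 > \mu - (\mu - 1)(\alpha'' + \alpha') > 0$. Using the assumption
$C + 2(R_{int} - R_{ext}p) > 0$ we can say that
$C + (\mu - (\mu - 1)(\alpha'' + \alpha'))\,\epsilon_{th}(R_{int} - R_{ext}p) > 0$.
Also, since $R_{int} < R_{ext}p$, we have
$C + (\mu - (\mu - 1)(\alpha'' + \alpha'))\,\epsilon_{th}(R_{int} - R_{ext}p) < C$.
Thus, using the inequalities above and $\zeta \le 2\delta I$, $U^{D+\delta I}_{\max} - U^{D}_{\max}$ is less than

$$2\,\delta I \cdot \min\left( \frac{e}{p},\; \frac{a^t C}{\mu(1 - p)(I - \delta I)} \right)$$

which is maximized for $a^t = U_k$.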

Reminder of Theorem 2. Assume that the budget is fixed in every round and is sufficient
to deter $\mathcal{A}$, and the number of tasks performed by $\mathcal{A}$ in every round is fixed. Let $v^*$ be
the maximum PPE payoff that $\mathcal{D}$ can obtain. Further suppose there exists a PPE $E_{\max}$ in
which $\mathcal{D}$ always plays some action in the deterred region and the utility for $\mathcal{D}$ with $E_{\max}$
is $v^*$. Then a committed utility maximizing and privacy preserving $\mathcal{D}$ will choose to play
$E_{\max}$. Further, the action in $E_{\max}$ coincides with the action chosen by the simple commitment
strategy profile in each round.

Proof. If $\mathcal{D}$ always plays deterred points in $E_{\max}$ then the best option $\mathcal{A}$ has is to
perform 0 violations (with an $\epsilon_{th}$ probability of maximum violations). Let $(P, \alpha)$ be the
point in the deterred region that provides the highest payoff to $\mathcal{D}$ (under 0 violations and $\epsilon_{th}$
probability of maximum violations). In $E_{\max}$, $\mathcal{D}$ will always play $(P, \alpha)$. This is because
playing any other point in the deterred region results in a profitable single-stage deviation
by switching to $(P, \alpha)$. Since $E_{\max}$ is a PPE, there must exist a punishment strategy
(low continuation payoff) that makes switching to a non-deterred point non-profitable
for $\mathcal{D}$ [22].

Hence the total discounted payoff in $E_{\max}$ is $U_D(P, \alpha, 0)$; thus, $U_D(P, \alpha, 0) = v^*$.
A commitment-based utility maximizing $\mathcal{D}$ would prefer a PPE with the highest payoff,
and further assuming she chooses the best privacy preserving PPE among those, $\mathcal{D}$
would want the $E_{\max}$ PPE to be played. $\mathcal{D}$ will ensure $E_{\max}$ is played by committing
to its strategy. Thus, for the class of PPE strategies the point $(P, \alpha)$ will be chosen under
the assumptions of the theorem.

Now, we claim that with the simple commitment strategy the same point $(P, \alpha)$ will be
chosen by $\mathcal{D}$. To prove this, suppose on the contrary that a point $(P', \alpha')$ is chosen in the
non-deterred region. (Note that no other point in the deterred region can be chosen, because
$(P, \alpha)$ is the point in the deterred region that provides the highest payoff to $\mathcal{D}$.) Let the
maximum number of violations be $a$. Then it must be the case that $U_{ND}(P', \alpha', a) >
U_D(P, \alpha, 0)$. Also, $((P', \alpha'), a)$ is a Nash equilibrium of the stage game. This is easily
proven by observing that $a$ violations maximize $\mathcal{A}$'s utility when $(P', \alpha')$ is
played by $\mathcal{D}$, and $(P', \alpha')$ maximizes $\mathcal{D}$'s utility when $a$ violations are committed by $\mathcal{A}$.
A Nash equilibrium played in all rounds is a PPE [22]. Then the PPE that results from
playing the above NE in all rounds yields payoff $U_{ND}(P', \alpha', a)$, but, by the assumption
that $U_D(P, \alpha, 0) = v^*$, we obtain that this PPE payoff is greater than $v^*$. This is a
contradiction, as $v^*$ is the maximum payoff that $\mathcal{D}$ can obtain in any PPE. Thus, with
our strategy $(P, \alpha)$ will be chosen by $\mathcal{D}$.


B.1 Repeated Product Game - Definition and Results


If two players play multiple (repeated) independent games in parallel then it is possible
to consider a composition of these games which is itself a (repeated) game. By independent
games we mean that these games are played without any influence from the other
games in parallel. We define the composition below for a repeated game.

Definition 3. (Repeated Product Games) Let the two players play the independent one-shot
stage games $G_1, G_2, \ldots, G_n$ in parallel in each round of the corresponding $n$ repeated
games. A composition of the $n$ stage games is a single-shot game $G$ given by
player $i$'s ($i = 1, 2$) action space $S_i = S_{1i} \times S_{2i} \times \cdots \times S_{ni}$ and the payoff function
$r_i(s_1, s_2) = \sum_{j=1}^{n} r_{ji}(s_{j1}, s_{j2})$, where $r_{ji}$ is player $i$'s payoff function in $G_j$,
$s_{ji} \in S_{ji}$ and $s_i \in S_i$. A repeated product game
is a repeated game with the stage game in every round given by $G$.
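
As a concrete illustration of Definition 3, the sketch below composes two small stage games into the single-shot game $G$: composite actions are tuples with one component per game, and each player's composite payoff is the sum of the component payoffs. The dictionary representation, the function names and the example payoffs are assumptions for illustration, not values from the audit model.

```python
from itertools import product

def action_sets(game):
    """Return the sorted action lists (S_j1, S_j2) of one stage game."""
    s1 = sorted({a for a, _ in game})
    s2 = sorted({b for _, b in game})
    return s1, s2

def compose(games):
    """Build the composite stage game G from component stage games G_1..G_n.

    Each game maps an action pair (s_j1, s_j2) to a payoff pair (r_j1, r_j2).
    """
    sets = [action_sets(g) for g in games]
    space2 = list(product(*(s[1] for s in sets)))
    composite = {}
    for s1 in product(*(s[0] for s in sets)):
        for s2 in space2:
            payoffs = [g[(a, b)] for g, a, b in zip(games, s1, s2)]
            composite[(s1, s2)] = (sum(p[0] for p in payoffs),
                                   sum(p[1] for p in payoffs))
    return composite

# Hypothetical component games: player 1 audits, player 2 works.
g1 = {("inspect", "comply"): (-1, 0), ("inspect", "violate"): (-2, -3),
      ("skip", "comply"): (0, 0), ("skip", "violate"): (-5, 2)}
g2 = {("inspect", "comply"): (-2, 0), ("inspect", "violate"): (-3, -4),
      ("skip", "comply"): (0, 0), ("skip", "violate"): (-8, 3)}
G = compose([g1, g2])
```

With two 2x2 component games the composite game has 4 x 4 = 16 action profiles, which shows how quickly the joint action space grows and why decomposing strategies per component game is useful.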

We can extend the above definition to games with imperfect monitoring and public
signaling, similar to the manner in which a standard repeated game is extended. Observe
that any strategy $\sigma$ of a repeated product game can be decomposed into strategies
$\sigma_1, \ldots, \sigma_n$ of the component games, because of the independence assumption of the
component games. This decomposition leads to the following useful results, summarized
in the lemma below:

Lemma 1. Let RG be a repeated product game with the stage game given by $G$, such
that $G$ is a parallel composition of $G_1, G_2, \ldots, G_n$ as defined in Definition 3. Consider
a strategy $\sigma$ of RG with the decomposition into strategy $\sigma_i$ for each component game.
Then

- The strategy $\sigma$ is a SPE iff the strategy $\sigma_i$ is a SPE for the repeated game with stage
game $G_i$ for all $i$.

- The strategy $\sigma$ is a $(\sum_{i=1}^{n} \epsilon_{1,i}, \sum_{i=1}^{n} \epsilon_{2,i})$-SPE if the strategy $\sigma_i$ is a $(\epsilon_{1,i}, \epsilon_{2,i})$-SPE
for the repeated game with stage game $G_i$ for all $i$.

Proof. For the first case assume $\sigma_i$ is a SPE for the repeated game with stage game
$G_i$ for all $i$. Since $\sigma$ is given by $\sigma_1, \ldots, \sigma_n$, any unilateral deviation from $\sigma$ results in
a unilateral deviation from one or more of $\sigma_1, \ldots, \sigma_n$; suppose it is $\sigma_j$. By assumption
that is not profitable for the repeated game given by the stage game $G_j$. Since the payoff
in $G$ is the sum of payoffs in $G_1, \ldots, G_n$ and the payoffs of games other than the $j$th game
remain the same, the deviation is not profitable for $G$ either.

The other direction is very similar. Assume $\sigma$ is a SPE. Since $\sigma$ is given by $\sigma_1, \ldots, \sigma_n$,
any unilateral deviation from $\sigma_j$ results in a unilateral deviation from $\sigma$. By assumption
that is not profitable for the repeated game given by the stage game $G$. Since the payoff
in $G$ is the sum of payoffs in $G_1, \ldots, G_n$ and the payoffs of games other than the $j$th game
remain the same, the deviation is not profitable for $G_j$ either.

Next, for the second part, since the payoff of $G$ is the sum of the payoffs of the $G_i$'s and
any unilateral deviation from $\sigma$ results in a unilateral deviation from one or more
of $\sigma_1, \ldots, \sigma_n$, it is not difficult to check that the profit from deviation will be no
more than the sum of the profits from deviation in each of the repeated games defined by
$G_1, \ldots, G_n$. Thus, the maximum profit from deviation for player $j$ is $\sum_{i=1}^{n} \epsilon_{j,i}$.
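
The additivity argument can be illustrated on the one-shot analogue of the lemma: in a composite stage game, a player's best unilateral deviation gain equals the sum of the best gains in the component games, because payoffs add and components are independent. The sketch below brute-forces this for two small games. It is only a one-shot check with hypothetical payoffs and helper names, and does not model repeated play or subgame perfection.

```python
from itertools import product

def compose(games):
    """Composite stage game: tuple-valued actions, payoffs summed over components."""
    s1_sets = [sorted({a for a, _ in g}) for g in games]
    s2_sets = [sorted({b for _, b in g}) for g in games]
    composite = {}
    for s1 in product(*s1_sets):
        for s2 in product(*s2_sets):
            parts = [g[(a, b)] for g, a, b in zip(games, s1, s2)]
            composite[(s1, s2)] = tuple(sum(p[i] for p in parts) for i in (0, 1))
    return composite

def best_gain(game, profile, player):
    """Largest payoff increase `player` (0 or 1) gets by unilaterally deviating."""
    s1, s2 = profile
    if player == 0:
        best = max(game[(a, s2)][0] for a, b in game if b == s2)
    else:
        best = max(game[(s1, b)][1] for a, b in game if a == s1)
    return best - game[profile][player]

def gains_add_up(games):
    """Check that composite deviation gains equal the sum of component gains."""
    composite = compose(games)
    for s1, s2 in composite:
        for player in (0, 1):
            total = sum(best_gain(g, (a, b), player) for g, a, b in zip(games, s1, s2))
            if best_gain(composite, (s1, s2), player) != total:
                return False
    return True

# Hypothetical component games with integer payoffs (player 1 audits, player 2 works).
g1 = {("inspect", "comply"): (-1, 0), ("inspect", "violate"): (-2, -3),
      ("skip", "comply"): (0, 0), ("skip", "violate"): (-5, 2)}
g2 = {("inspect", "comply"): (-2, 0), ("inspect", "violate"): (-3, -4),
      ("skip", "comply"): (0, 0), ("skip", "violate"): (-8, 3)}
additive = gains_add_up([g1, g2])
```

A profile with zero gain for both players in every component is therefore a Nash equilibrium of the composite game, mirroring the first part of the lemma.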


B.2  Determining  Pf  and  Punishment 


In addition to the action dependent utilities above, the players also get a fixed utility
in each round of $G_A$, which is the salary $Sal_A$ for $\mathcal{A}$ and the value created by the
employee, $g \times Sal_A$, for $\mathcal{D}$. Note that this is the salary and value created for the duration
of one audit cycle. Also, note that this fixed utility is not part of any game $G_{A,k}$.
Let $R_k$ be the maximum loss of reputation possible for violations of type $k$ (when all
tasks are violations that are externally detected). We assume that the maximum punishment
rate $P_{f,A}(k)$ for each type $k$ is proportional to $R_k$. Since the employee can
make mistakes, in the worst case he can lose an expected amount of $\epsilon_{th} \sum_{k} P_{f,A}(k) U_k$.
This loss must be less than a fixed fraction $net$ of $Sal_A$, or else the employee is better
off quitting and getting a better expected payoff in every round in some other job.
Thus, we must have $\epsilon_{th} \sum_{k} P_{f,A}(k) U_k = net \cdot Sal_A$, which yields a value
$P_{f,A}(k) = R_k \cdot net \cdot Sal_A / (\epsilon_{th} \sum_{j} R_j U_j)$. Observe that an employee with a higher salary can be
punished more. For example, suppose $\mathcal{A}$ does two types $k, k'$ of tasks such that in every
week $U_k = 40$, $U_{k'} = 400$, $R_{k'} = 0.5 R_k$ and $net = 0.15$, with weekly salary \$500.
Then, $P_f(k) = 10.4$ and $P_f(k') = 5.2$.
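
The worked example can be reproduced numerically. The sketch below assumes $\epsilon_{th} = 0.03$, the value used in the earlier numeric example of the appendix (the paragraph above does not restate it), and normalizes $R_k = 1$ since only the ratio $R_{k'} / R_k$ matters. Function and variable names are illustrative.

```python
def max_punishment_rates(rep_loss, tasks, net_fraction, salary, eps_th):
    """P_f(k) = R_k * net * Sal / (eps_th * sum_j R_j * U_j) for every task type k."""
    denom = eps_th * sum(rep_loss[k] * tasks[k] for k in rep_loss)
    return {k: rep_loss[k] * net_fraction * salary / denom for k in rep_loss}

# Values from the example; R_k is normalized to 1.0 and eps_th = 0.03 is an assumption.
rep_loss = {"k": 1.0, "k_prime": 0.5}
tasks = {"k": 40, "k_prime": 400}
rates = max_punishment_rates(rep_loss, tasks, net_fraction=0.15, salary=500, eps_th=0.03)

# Worst-case expected loss eps_th * sum_k P_f(k) * U_k, which should equal net * salary.
expected_loss = 0.03 * sum(rates[k] * tasks[k] for k in rates)
```

Under these assumptions the rates come out at roughly 10.4 and 5.2, and the worst-case expected loss equals 15% of the weekly salary, i.e. \$75.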

Next, consider the case that the employee is non-deterred for violations of type $k$.
Then suppose the expected loss to the organization in every round for such a case is at
most $U_k L_k$, where $L_k$ is the maximum per-violation cost (dependent on $\alpha$) that can
be calculated from our model. In such a case, if it happens that $U_k L_k > (g - 1) Sal_A$,
then the organization obtains no benefit from employing $\mathcal{A}$. Thus, in such a case the
organization must fire the employee.
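
The firing condition is a single comparison, sketched below. The numbers in the usage lines are hypothetical and chosen only to show both outcomes; $L_k$ would in practice be computed from the model, which this sketch does not attempt.

```python
def should_fire(max_tasks, max_cost_per_violation, value_multiplier, salary):
    """Return True when U_k * L_k > (g - 1) * Sal_A, i.e. employing A yields no net benefit."""
    worst_case_loss = max_tasks * max_cost_per_violation
    net_value = (value_multiplier - 1) * salary
    return worst_case_loss > net_value

# Hypothetical inputs: U_k = 40 tasks, g = 3, weekly salary 500.
fire_costly = should_fire(max_tasks=40, max_cost_per_violation=30.0,
                          value_multiplier=3.0, salary=500.0)   # 1200 > 1000
fire_cheap = should_fire(max_tasks=40, max_cost_per_violation=20.0,
                         value_multiplier=3.0, salary=500.0)    # 800 <= 1000
```

This is the same threshold referred to under Prediction 4: a non-deterred employee is retained unpunished only while the worst-case violation cost stays below the net value the employee creates.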


